TRANSACTIONS 
OF THE


AMERICAN MATHEMATICAL SOCIETY 


EDITED BY 


EINAR HILLE C. C. MAC DUFFEE OSCAR ZARISKI 


WITH THE COOPERATION OF 


A. A. ALBERT NELSON DUNFORD W. K. FELLER 
T. H. HILDEBRANDT S. C. KLEENE R. E. LANGER
SAUNDERS MACLANE OYSTEIN ORE H. P. ROBERTSON 
D. J. STRUIK J. L. SYNGE GABOR SZEGO 
HASSLER WHITNEY G. T. WHYBURN R. L. WILDER 


VOLUME 52 
JULY TO DECEMBER, 1942 


PUBLISHED BY THE SOCIETY
MENASHA, WIS., AND NEW YORK 
1942 



Composed, Printed and Bound by
The Collegiate Press
George Banta Publishing Company
Menasha, Wisconsin


TABLE OF CONTENTS 
VOLUME 52, JULY TO DECEMBER, 1942 


AGNEW, R. P. Analytic extension by Hausdorff methods

BAER, R. A unified theory of projective spaces and finite abelian groups

DOOB, J. L. Topics in the theory of Markoff chains

HERRIOT, J. G. Nörlund summability of double Fourier series

HILLE, E. On the oscillation of differential transforms. II. Characteristic
series of boundary value problems

KELLEY, J. L. Hyperspaces of a continuum

KOLCHIN, E. R. On the basis theorem for differential systems

LORCH, E. R. The spectrum of linear transformations

McSHANE, E. J. Sufficient conditions for a weak relative minimum in
the problem of Bolza

MANDELBROJT, S., and ULRICH, F. E. On a generalization of the problem
of quasi-analyticity

MARTIN, M. H. The restricted problem of three bodies

NIVEN, I. Quadratic Diophantine equations in the rational and quadratic
fields

PITCHER, E., and SMILEY, M. F. Transitivities of betweenness

PÓLYA, G. On converse gap theorems

PÓLYA, G., and WIENER, N. On the oscillation of the derivatives of a
periodic function

REID, W. T. A new class of self-adjoint boundary value problems

RICKART, C. E. Integration in a convex linear topological space

ROBINSON, R. M. Bounded univalent functions

SEIDEL, W., and WALSH, J. L. On the derivatives of functions analytic
in the unit circle and their radii of univalence and of p-valence

SMILEY, M. F., and PITCHER, E. Transitivities of betweenness

SNAPPER, E. Structure of linear sets

SZÁSZ, O. On the partial sums of harmonic developments and of power
series

SZEGŐ, G. On the oscillation of differential transforms. I

ULRICH, F. E., and MANDELBROJT, S. On a generalization of the problem
of quasi-analyticity

WALSH, J. L., and SEIDEL, W. On the derivatives of functions analytic
in the unit circle and their radii of univalence and of p-valence

WIENER, N., and PÓLYA, G. On the oscillation of the derivatives of a
periodic function




QUADRATIC DIOPHANTINE EQUATIONS IN THE 
RATIONAL AND QUADRATIC FIELDS 


BY 
IVAN NIVEN 


1. Introduction and summary. It is well known that the equation

(1) $lx + my + n = 0,$

with rational integral coefficients, has either no solution in rational integers
or an infinite number of solutions. The same result is true in quadratic fields,
that is, when $l$, $m$ and $n$ are integers of a given quadratic field, and solutions
are sought among the integers of the field.
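
The dichotomy for the linear equation (1) over the rational integers can be illustrated numerically. The following is a minimal sketch, not part of the paper: the helper names `ext_gcd` and `linear_solutions` are hypothetical, and the code assumes that l and m are not both zero. It uses the extended Euclidean algorithm to find one solution, then shifts it along the line.

```python
def ext_gcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) >= 0 and s*a + t*b == g."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    if old_r < 0:
        old_r, old_s, old_t = -old_r, -old_s, -old_t
    return old_r, old_s, old_t


def linear_solutions(l, m, n, count=5):
    """Return `count` distinct integer solutions of l*x + m*y + n == 0,
    or an empty list when no integer solution exists.
    Assumption: l and m are not both zero."""
    g, s, t = ext_gcd(l, m)
    if n % g != 0:
        return []  # gcd(l, m) does not divide n: no solution at all
    x0, y0 = -s * (n // g), -t * (n // g)
    # Every shift by (m/g, -l/g) stays on the line, so one solution
    # already yields infinitely many.
    return [(x0 + k * (m // g), y0 - k * (l // g)) for k in range(count)]
```

For example, `linear_solutions(6, 10, 8)` returns five distinct solutions, while `linear_solutions(6, 10, 7)` returns an empty list because 2 does not divide 7.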
We are here concerned with the number of integral solutions of the general
quadratic equation

(2) $ax^2 + bxy + cy^2 + dx + ey + f = 0, \quad a \neq 0; \quad A = b^2 - 4ac,$

with integral coefficients from the field of rational numbers or from some
quadratic field. The quantity $A$ is defined for convenient reference. We can
take $a \neq 0$ without any loss of generality, by the use (if necessary) of linear
transformations of determinant unity (so that the number of integral solutions
is not changed).
First, suppose that the coefficients of (2) are rational integers. If $A$ is
negative, then the graph of (2) is finite in extent, and there is at most a finite
number of solutions in integers. If $A \geq 0$, the graph of (2) is a parabola, an
hyperbola, or two straight lines, and we prove the following result.


THEOREM 1. Let the coefficients of equation (2) be rational integers, with
$A \geq 0$. Then if (2) has one solution in integers, it has an infinite number, with
the following exceptions: if (2) represents two essentially irrational straight lines,
it has at most one integral solution; if (2) is an hyperbola whose asymptotes are
essentially rational, then it has at most a finite number of integral solutions.


By an essentially rational straight line, we mean one whose equation can 
be put in the form (1), with rational integral coefficients; otherwise we say 
that a line is essentially irrational.

Next, suppose that the coefficients of (2) are from a real quadratic field.
It turns out in this case that we can have an infinite number of integral
solutions when the curve is finite in extent. Also, the hyperbola does not divide
into two cases, as it does in Theorem 1. Before stating the theorem, we recall
the definition that a totally negative quadratic integer is a negative integer
whose conjugate is also negative. Thus $-5 - 2^{1/2}$ is totally negative, whereas
$-5 - 4(2^{1/2})$ is negative but not totally negative.

Presented to the Society, September 5, 1941; received by the editors April 29, 1941.
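
The two examples of the definition can be checked with exact integer arithmetic. This is a small sketch under stated assumptions: an integer of the field is written as a + b m^{1/2} with rational integers a, b and a non-square m > 1, and the function names are hypothetical. Both conjugates a + b m^{1/2} and a - b m^{1/2} are negative exactly when their sum 2a is negative and their product a^2 - b^2 m is positive.

```python
def is_negative(a, b, m):
    """Sign test for a + b*sqrt(m), using only integer comparisons."""
    if a <= 0 and b <= 0:
        return (a, b) != (0, 0)
    if a >= 0 and b >= 0:
        return False
    if a < 0:                      # here b > 0: negative iff |a| > b*sqrt(m)
        return a * a > b * b * m
    return b * b * m > a * a       # here a > 0 > b


def is_totally_negative(a, b, m):
    """True when a + b*sqrt(m) and its conjugate a - b*sqrt(m) are both negative."""
    return a < 0 and a * a > b * b * m
```

With m = 2, the pair (a, b) = (-5, -1) is totally negative, while (-5, -4) is negative but has the positive conjugate -5 + 4(2^{1/2}).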


THEOREM 2. Let the coefficients of (2) be integers of a real quadratic field F. 
Then if (2) has one solution in integers of F, it has an infinite number, except 
in the following cases: if (2) represents a point, or a pair of straight lines whose 
coefficients are essentially outside the field F, then it has at most one integral 
solution in F; if (2) represents an ellipse (so that A is negative), and A is totally 
negative, then it has at most a finite number of integral solutions in F.


Finally, suppose the coefficients of (2) are from an imaginary quadratic 
field. Our result is much the same, but there are interesting differences. 


THEOREM 3. Let the coefficients of (2) be integers of an imaginary quadratic 
field F. Then one solution of (2) in integers implies an infinite number of such 
solutions, with the following exceptions: if the left side of (2) factors into two lin- 
ear expressions in x and y, with coefficients essentially outside F, then (2) has at 
most one integral solution in F; if $A \neq 0$ is the square of an integer of F, and the
left side of (2) is not factorable into linear expressions in x and y, then (2) has 
at most a finite number of integral solutions in F. 


In proving Theorems 2 and 3, we use the Pell equation in quadratic fields,

(3) $\xi^2 - \gamma\eta^2 = 1.$

In this connection, we prove the following theorem.


THEOREM 4. Let $\gamma$ be an integer, not zero, of a quadratic field F. Then
equation (3) has an infinite number of integral solutions $(\xi, \eta)$ in the field F if and
only if $\gamma$ is not the square of an integer of F when F is imaginary, and $\gamma$ is not
totally negative when F is real.


That the conditions of this theorem are sufficient to insure an infinite 
number of solutions of (3), is proved in the next two sections. The necessity 
of the conditions follows from Theorems 2 and 3, as we shall see at the end 
of §3. 

Theorems 1, 2, and 3 are sufficiently similar that the principal results can 
be proved by a common method; this is presented in §4. Then the theorems 
are completed in the last three sections. The methods employed throughout 
the paper are elementary. 

2. The Pell equation in quadratic fields. If $\gamma$ is a positive rational integer,
not a square, it is well known that equation (3) has an infinite number of
solutions in rational integers. We can obtain a similar result for quadratic
fields from a classical theorem on the units of relatively cyclic fields.

Let $\gamma$ now be an integer of a quadratic field F. If $\gamma$ is not a perfect square
in F, and not a rational integer, then the biquadratic field $K = R(\gamma^{1/2})$ is
relatively cyclic of prime order two over F. It is known(1) that there exists a
relative unit of norm 1 in the field K over F, provided that among the four
conjugate fields determined by K there are twice as many real fields as there
are among the two conjugate fields determined by F. This condition is
satisfied when F is an imaginary field, because the conjugate of F, being identical
with F, is also imaginary; consequently, equation (3) has a non-trivial
integral solution (that is, a solution with $\eta \neq 0$) in F provided $\gamma$ is not a perfect
square in F.

On the other hand, if F is a real quadratic field, then it is again identical
with its conjugate, and we require K and its conjugates to be real. Now K and
its conjugates are identical in pairs, each field being either $R(\gamma^{1/2})$ or $R(\bar\gamma^{1/2})$,
where $\bar\gamma$ is the conjugate of $\gamma$ in F. These are real provided that $\gamma$ and $\bar\gamma$ are
positive, or in other words, provided that $\gamma$ is totally positive. Hence we can
conclude that if $\gamma$ is a totally positive integer of a real quadratic field F, then
equation (3) has a non-trivial integral solution in F.

Having one solution of (3), we can obtain more by means of the composition
formula

$(\xi_1, \eta_1)\cdot(\xi_2, \eta_2) = (\xi_1\xi_2 + \gamma\eta_1\eta_2,\ \xi_1\eta_2 + \xi_2\eta_1).$

This provides an infinitude of different solutions. For example, a non-trivial
solution compounds with itself to give a different non-trivial solution. We
have proved this lemma.
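
The composition of solutions can be tried out in the simplest setting. The sketch below is an illustration only: it takes gamma to be a rational integer (here 2) rather than a general quadratic integer, and the function names are hypothetical. Each composed pair again satisfies xi^2 - gamma*eta^2 = 1 because the left sides multiply.

```python
def compose(s1, s2, gamma):
    """Compose two solutions of xi^2 - gamma*eta^2 = 1 into a third."""
    (x1, e1), (x2, e2) = s1, s2
    return (x1 * x2 + gamma * e1 * e2, x1 * e2 + x2 * e1)


def pell_solutions(start, gamma, count):
    """Compose a non-trivial solution with itself repeatedly."""
    sols = [start]
    while len(sols) < count:
        sols.append(compose(sols[-1], start, gamma))
    return sols
```

Starting from (3, 2) with gamma = 2, `pell_solutions((3, 2), 2, 4)` gives (3, 2), (17, 12), (99, 70), (577, 408).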


LEMMA 1. Let $\gamma$ be an integer, not a square, of any quadratic field F. Let $\gamma$
be totally positive if F is real. Then equation (3) has an infinite number of integral
solutions in F.


3. Real quadratic fields. Lemma 1 is not the best possible result for real 
quadratic fields. We now prove: 


LEMMA 2. If $\gamma$ is a positive, but not totally positive, integer of a real quadratic
field F, then equation (3) has an infinite number of integral solutions in F.


We prove this by a method analogous to that of Dirichlet(2) for rational
and Gaussian integers. Let F be obtained by extending the rational numbers
by $m^{1/2}$, $m$ being positive, square-free, and greater than 1. Then $\gamma$ has the
form $a + bm^{1/2}$, where $a - bm^{1/2}$ is negative. For convenience, let $\delta$ denote
the positive square root of $\gamma$. For any positive rational integer $n$, we let $v$
range over the values $1, 2, \ldots, n+1$. Let $u$ be the greatest integer less than
$vm^{1/2}$, that is, $u = [vm^{1/2}]$, and we have

(4) $|u - vm^{1/2}| < 1.$

(1) Cf. David Hilbert, Die Theorie der algebraischen Zahlkörper, Jahresbericht der Deutschen
Mathematiker-Vereinigung, vol. 4, p. 275 (Theorem 92) and p. 279.
(2) Cf. Dickson, History of the Theory of Numbers, vol. 2, p. 373.
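Choose $y$ and $x$ as follows:

$y = [\delta(u + vm^{1/2})/2m^{1/2}], \quad x = [\delta(u + vm^{1/2}) - ym^{1/2}] + 1.$

These equations imply the inequalities

$0 \leq \delta(u + vm^{1/2}) - 2ym^{1/2} < 2m^{1/2}$

and

(5) $0 < x + ym^{1/2} - \delta(u + vm^{1/2}) \leq 1,$

respectively, and these add to give the result

(6) $0 < x - ym^{1/2} < 1 + 2m^{1/2}.$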


As $v$ ranges over the integers $1, 2, \ldots, n+1$ the expression involved in (5)
takes values between 0 and 1, at least two of which differ by less than $1/n$.
We subtract these to obtain

(7) $|X + Ym^{1/2} - \delta(U + Vm^{1/2})| < 1/n,$

and the inequalities (4) and (6) imply that the rational integers $X$, $Y$, $U$ and
$V$ satisfy

(8) $|U - Vm^{1/2}| < 2, \quad |X - Ym^{1/2}| < 1 + 2m^{1/2}.$

Using the fact that $|V| \leq n$, we can write

$|X + Ym^{1/2} + \delta(U + Vm^{1/2})|$
$\quad \leq |X + Ym^{1/2} - \delta(U + Vm^{1/2})| + 2|\delta(U + Vm^{1/2})|$
$\quad < 1/n + 2\delta|U - Vm^{1/2}| + 2\delta|2Vm^{1/2}|$
$\quad < 1/n + 2\delta(2 + 2nm^{1/2}).$

The multiplication of this inequality by (7) gives

$|(X + Ym^{1/2})^2 - \gamma(U + Vm^{1/2})^2| < 1 + 2\delta(2 + 2m),$

and we set $\xi = X + Ym^{1/2}$ and $\eta = U + Vm^{1/2}$ to obtain

(9) $|\xi^2 - \gamma\eta^2| < 1 + 2\delta(2 + 2m).$


We now show that this inequality is satisfied by an infinitude of pairs
$(\xi, \eta)$. The left side of inequality (7) is not zero, for otherwise $\delta$ would be an
element of the field F. Then $\gamma$, being the square of an element of F, would be
totally positive, contrary to hypothesis. Now if the number of pairs of quadratic
integers satisfying (9) were finite, the rational integer $n$ could be chosen
so large that none of these pairs would satisfy (7). Our method would therefore
give another pair of values satisfying (7) and (9).

Having shown that (9) represents an infinite number of inequalities, we
now show that $\xi^2 - \gamma\eta^2$ assumes only a finite set of values. We cannot conclude
this directly from inequality (9), because there are infinitely many integers
of a real quadratic field which are less in absolute value than a given positive
quantity. However, there is but a finite number of integers of such a field
which, together with their conjugates, are bounded in absolute value. Since $\bar\gamma$ is
negative, we can use (8) to obtain

$\bar\xi^2 - \bar\gamma\bar\eta^2 = (X - Ym^{1/2})^2 - \bar\gamma(U - Vm^{1/2})^2 < (1 + 2m^{1/2})^2 - \bar\gamma(4).$

Hence the infinite set of quadratic integers $\xi^2 - \gamma\eta^2$ of inequality (9) ranges
over a finite set of values. At least one of these values, say $\rho$, is equal to
$\xi^2 - \gamma\eta^2$ for an infinite number of pairs

(10) $(\xi_1, \eta_1), (\xi_2, \eta_2), \ldots.$
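We now show that it is possible to select from (10) an infinite subsequence

(11) $(\xi_{i_1}, \eta_{i_1}), (\xi_{i_2}, \eta_{i_2}), \ldots$

such that

(12) $\xi_{i_j} - \xi_{i_1} \equiv 0 \pmod{\rho}, \quad \eta_{i_j} - \eta_{i_1} \equiv 0 \pmod{\rho} \quad (j = 1, 2, 3, \ldots).$

Let the quantities $\xi_1, \xi_2, \ldots$ of (10) be written as

(13) $X_1 + Y_1m^{1/2}, \ X_2 + Y_2m^{1/2}, \ \ldots,$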


the $X_i$ and $Y_i$ being rational integers. Let $N(\rho)$ denote the norm of $\rho$. Since
each $X_i$ ($i = 1, 2, \ldots$) is congruent to some term of the complete residue
system $0, 1, 2, \ldots, N(\rho) - 1$, modulo $N(\rho)$, it follows that an infinite number
of these are congruent to one particular term of this residue system. Thus
from (13) we have selected an infinite subsequence, and from the latter we
can select another so that the $Y_i$ are congruent to one another modulo $N(\rho)$.
We continue this process of selecting subsequences with the terms $\eta_i$ of (10),
and obtain finally a sequence (11) such that congruences analogous to (12)
hold modulo $N(\rho)$, and these imply (12).

We now select two different pairs from (11), say $(\xi_r, \eta_r)$ and $(\xi_s, \eta_s)$; let
these be independent in the sense that one pair is not the negative of the other
pair. They satisfy the relations

(14) $\xi_r^2 - \gamma\eta_r^2 = \rho, \quad \xi_s^2 - \gamma\eta_s^2 = \rho,$

and we multiply these equations to get

(15) $(\xi_r\xi_s - \gamma\eta_r\eta_s)^2 - \gamma(\xi_r\eta_s - \xi_s\eta_r)^2 = \rho^2.$

But the congruences (12) imply that the integer $\xi_r\eta_s - \xi_s\eta_r$ is divisible by $\rho$.
Consequently $\xi_r\xi_s - \gamma\eta_r\eta_s$ is divisible by $\rho$, and we obtain a solution of (3) by
dividing (15) by $\rho^2$. The solution thus obtained is not trivial, that is,
$\xi_r\eta_s - \xi_s\eta_r \neq 0$. For otherwise we could write $\xi_r = k\xi_s$ and $\eta_r = k\eta_s$ with $k \neq \pm 1$,
and these relationships contradict (14). Noting that an infinite number of
solutions of (3) can now be obtained by the method set forth at the end of §2,
we have completed the proof of the lemma.


LEMMA 3. Let $\gamma$ be a negative, but not totally negative, integer of a real
quadratic field F. Then equation (3) has infinitely many integral solutions in F.


This is a direct consequence of Lemma 2. For, by hypothesis the integer $\bar\gamma$
is positive. Hence there are infinitely many solutions of

$\xi^2 - \bar\gamma\eta^2 = 1.$

The conjugates of these solutions are solutions of (3), and the lemma is
proved.


LEMMA 4. Suppose that $\gamma = \alpha^2 \neq 0$, where $\alpha$ is an integer of a real quadratic
field F. Then equation (3) has an infinite number of integral solutions in F.


As in Lemma 2, we take F to be $R(m^{1/2})$. When $\alpha$ is multiplied by its
conjugate $\bar\alpha$, the result is a rational integer, the norm of $\alpha$, say $n$. Now there
are infinitely many pairs of rational integers satisfying

$u^2 - mn^2v^2 = 1,$

since $mn^2$ is not a square. Taking $\xi = u$ and $\eta = m^{1/2}\bar\alpha v$, we obtain infinitely
many solutions of (3).
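
The substitution in Lemma 4 can be checked on a concrete case. The following sketch is an illustration under assumptions of our own: elements a + b m^{1/2} are stored as pairs (a, b) of rational integers, the example field is R(2^{1/2}) with alpha = 2 + 2^{1/2}, and all function names are hypothetical.

```python
def mul(p, q, m):
    """Multiply a + b*sqrt(m) and c + d*sqrt(m), each stored as a pair."""
    (a, b), (c, d) = p, q
    return (a * c + b * d * m, a * d + b * c)


def lemma4_solution(alpha, m, u, v):
    """Given rational integers with u^2 - m*n^2*v^2 = 1, where n is the norm
    of alpha, return xi = u and eta = sqrt(m) * conj(alpha) * v."""
    a, b = alpha
    n = a * a - m * b * b                      # norm of alpha
    assert u * u - m * n * n * v * v == 1
    conj = (a, -b)
    eta = mul(mul((0, 1), conj, m), (v, 0), m)
    return (u, 0), eta


def pell_value(xi, eta, gamma, m):
    """Return xi^2 - gamma*eta^2 as a pair."""
    xi2 = mul(xi, xi, m)
    g_eta2 = mul(gamma, mul(eta, eta, m), m)
    return (xi2[0] - g_eta2[0], xi2[1] - g_eta2[1])
```

With m = 2 and alpha = (2, 1), the norm is n = 2, so mn^2 = 8 and (u, v) = (3, 1) satisfies u^2 - 8v^2 = 1. The resulting pair xi = 3, eta = -2 + 2(2^{1/2}) then satisfies (3) for gamma = alpha^2 = 6 + 4(2^{1/2}).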

Lemmas 1, 2, 3, and 4 give all cases of Pell equations (3) in quadratic 
fields with an infinite number of solutions, for it is a consequence of Theorems 
2 and 3 that equation (3) can have but a finite number of integral solutions 
for values of $\gamma$ other than those stated in the above lemmas. Thus, upon
proving these theorems, we shall have Theorem 4 as a consequence. 

4. The general theory. We return our attention to equation (2), the
coefficient field F being the rational numbers or some quadratic field. Solving for $x$
we get

(16) $x = \frac{1}{2a}\{-by - d \pm (Ay^2 + By + C)^{1/2}\},$

where $B = 2bd - 4ae$ and $C = d^2 - 4af$.
Case 1. $B^2 - 4AC \neq 0$; $A$ is positive and not a square when F is the field of
rational numbers; $A$ is neither zero nor totally negative when F is a real
quadratic field; $A$ is not the square of an integer of F when F is an imaginary
quadratic field. With these hypotheses, we show that one integral solution
of (2) implies an infinite number. Let there be such a solution $(x_0, y_0)$. Then
there exists an integer $t_0$ such that the equation

(17) $t^2 = Ay^2 + By + C$

is satisfied by the values $t_0$, $y_0$. We substitute these values in (17), and
subtract the result from (17), to get an equation which can be written in the form

(18) $(t - t_0)(t + t_0) = (y - y_0)(Ay + Ay_0 + B).$

We look for integral solutions of this equation. We write

(19) $p(y - y_0) = 2aq(t + t_0), \quad 2aq(Ay + Ay_0 + B) = p(t - t_0),$

where $p$ and $q$ will be specified later. Eliminating $t$ from these equations, we
get

(20) $(p^2 - 4a^2Aq^2)y = p^2y_0 + 4a^2q^2(Ay_0 + B) + 4pqat_0.$

By the hypotheses of the case under discussion, and by the lemmas of the
last two sections, we can choose the integers $p$ and $q$ in infinitely many ways
so that

(21) $p^2 - 4a^2Aq^2 = 1.$


Thus we obtain integral values for $y$ in (20). These, in turn, give integral
values for $t$ in (19), as can be seen by eliminating $y$ from these equations.

We now make certain that these values of $y$ give integral values of $x$ in
(16). Multiplying the first equation in (19) by $p$, and eliminating $p^2$ by the
use of (21), we see that

$(y - y_0) + 4a^2Aq^2(y - y_0) = 2apq(t + t_0).$

Hence $y \equiv y_0 \pmod{2a}$, and the same argument applied to the second equation
in (19) shows that $t \equiv t_0 \pmod{2a}$. These imply the congruence

$-by - d \pm t \equiv -by_0 - d \pm t_0 \pmod{2a}.$

Since $y_0$ and $t_0$ give the integral value $x_0$ in (16), this congruence shows that
our method gives integral values for $x$, provided that the sign is chosen
properly.

Finally we must demonstrate that the above procedure gives an infinitude
of solutions of (16). Using (21) to eliminate $p^2$ from (20), we have

(22) $y = y_0 + 4aq\{aq(2Ay_0 + B) + pt_0\}.$

First, suppose that $t_0 = 0$. Then $2Ay_0 + B \neq 0$, for otherwise we could write
$y_0 = -B/2A$, and these values of $t_0$ and $y_0$, when substituted in (17), give
$B^2 - 4AC = 0$, contrary to hypothesis. Also $a \neq 0$, so that the coefficient of $q^2$
in (22) is not zero. Consequently, each different value of $q^2$ gives a different
value of $y$.

In the second place, if $t_0 \neq 0$, we show that of all the values satisfying (21),
only a finite number give $y = y_0$ in (22). Values of $p$ and $q$ giving $y = y_0$ satisfy
$aq(2Ay_0 + B) + pt_0 = 0$, and the result of eliminating $p$ from (21) by means of
this equation is

$q^2\{(2aAy_0 + aB)^2 - 4a^2At_0^2\} = t_0^2.$

This is satisfied by not more than two values of $q$.

Suppose now that equation (22) gives only a finite set of values, say
$y_0, y_1, \ldots, y_r$. We select a rational prime $\pi$ which does not divide any of
$y_1 - y_0, y_2 - y_0, \ldots, y_r - y_0$. Let $(P, Q)$ be such a solution of

$P^2 - 4a^2A\pi^2Q^2 = 1$

that the corresponding solutions $p = P$, $q = \pi Q$ of (21) do not give $y = y_0$ in
(22). Then the value $y$ thus obtained from (22), having the property that
$y - y_0$ is divisible by $\pi$, is different from $y_1, y_2, \ldots, y_r$. We have shown,
therefore, that (22) gives an infinite set of different values.
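
The construction of Case 1 can be followed on a rational example. The sketch below is illustrative only: it works over the rational integers, the function names are hypothetical, and the closed form used for t, namely t = (p^2 + 4a^2Aq^2)t_0 + 2apq(2Ay_0 + B), is our own elimination of y from the two equations (19) and is not printed in the paper. The value of y is taken from (22), and the sign in (16) is fixed by setting t_0 = 2ax_0 + by_0 + d.

```python
def conic_value(coeffs, x, y):
    """Left side of (2): a x^2 + b x y + c y^2 + d x + e y + f."""
    a, b, c, d, e, f = coeffs
    return a * x * x + b * x * y + c * y * y + d * x + e * y + f


def next_solution(coeffs, x0, y0, p, q):
    """From one integral solution (x0, y0) of (2) and a solution (p, q)
    of (21), build another integral solution of (2)."""
    a, b, c, d, e, f = coeffs
    A = b * b - 4 * a * c
    B = 2 * b * d - 4 * a * e
    assert conic_value(coeffs, x0, y0) == 0
    assert p * p - 4 * a * a * A * q * q == 1          # equation (21)
    t0 = 2 * a * x0 + b * y0 + d                       # from (16), sign absorbed in t0
    y = y0 + 4 * a * q * (a * q * (2 * A * y0 + B) + p * t0)   # equation (22)
    t = (p * p + 4 * a * a * A * q * q) * t0 + 2 * a * p * q * (2 * A * y0 + B)
    numerator = -b * y - d + t
    assert numerator % (2 * a) == 0                    # guaranteed by the congruences
    return numerator // (2 * a), y
```

For x^2 - 2y^2 - 1 = 0 one has A = 8, and (21) reads p^2 - 32q^2 = 1, solved by (p, q) = (17, 3). Starting from (x_0, y_0) = (3, 2), the function returns (3363, 2378), which again lies on the curve.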

Case 2. $A = 0$, $B^2 - 4AC \neq 0$, so that $B \neq 0$. Again we assume one integral
solution $(x_0, y_0)$ of (16), and show that it can be used to generate an infinite
number. Proceeding as we did in the first case, we get the following equation
analogous to (18)

$B(y - y_0) = (t - t_0)(t + t_0).$

We write

$y - y_0 = 2ag(t + t_0), \quad t - t_0 = 2agB,$

where $g$ is any integer of F. Eliminating $t$ from these equations, we have

$y = 4a^2g^2B + 4agt_0 + y_0.$

The coefficient of $g^2$ is not zero, and hence this formula gives an infinitude of
integral values of $y$. As in Case 1, we have $y \equiv y_0$ and $t \equiv t_0 \pmod{2a}$, so that
the values of $y$ give integral values of $x$ in (16).
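
The parabolic case admits an equally short numerical illustration. This is a sketch over the rational integers with hypothetical function names; it applies the formulas y = 4a^2g^2B + 4agt_0 + y_0 and t = t_0 + 2agB of Case 2, with t_0 = 2ax_0 + by_0 + d fixing the sign in (16).

```python
def conic(coeffs, x, y):
    """Left side of (2) for the coefficient tuple (a, b, c, d, e, f)."""
    a, b, c, d, e, f = coeffs
    return a * x * x + b * x * y + c * y * y + d * x + e * y + f


def parabola_solutions(coeffs, x0, y0, gs):
    """For each integer g in gs, build an integral point of a parabola (2)
    with A = b^2 - 4ac = 0 from the known integral point (x0, y0)."""
    a, b, c, d, e, f = coeffs
    assert b * b - 4 * a * c == 0
    assert conic(coeffs, x0, y0) == 0
    B = 2 * b * d - 4 * a * e
    t0 = 2 * a * x0 + b * y0 + d
    points = []
    for g in gs:
        y = 4 * a * a * g * g * B + 4 * a * g * t0 + y0
        t = t0 + 2 * a * g * B
        x = (-b * y - d + t) // (2 * a)   # exact, since y = y0 and t = t0 mod 2a
        points.append((x, y))
    return points
```

For the curve x^2 + 2xy + y^2 - x - 3y = 0 with the starting point (0, 0), the choice g = 1 gives the point (-20, 28).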

Case 3. $B^2 - 4AC = 0$; neither $A$ nor $C$ is negative when F is a real field.
In other words, we are now treating the case where the left side of equation
(2) factors into two linear expressions, both being real when F is real. Equation
(16) can be written in the form

$2ax + by + d = \pm(A^{1/2}y + C^{1/2}).$

If both $A^{1/2}$ and $C^{1/2}$ are in F, then these linear equations have integral
coefficients from F. As was remarked at the beginning of §1, one integral
solution implies an infinite number.

If $A^{1/2}$ is in F, but $C^{1/2}$ is not, then obviously there is no integral solution
in F, for such a solution would enable us to write $C^{1/2}$ as an element of F.

If $C^{1/2}$ is in F, but $A^{1/2}$ is not, then any solution $(x_0, y_0)$ must have $y_0 = 0$.
Also, since $B = 2A^{1/2}C^{1/2}$, and $B$ is in F, it follows that $C = 0$. Hence the only
possible solution is $y_0 = 0$, $x_0 = -d/2a$.

If neither $A^{1/2}$ nor $C^{1/2}$ is in F, any integral solution $(x_0, y_0)$ must be such
that $A^{1/2}y_0 + C^{1/2} = 0$, which fixes the value of $y_0$; and $x_0$ must therefore satisfy
$2ax_0 + by_0 + d = 0$, so that there cannot be more than one solution.

5. The rational case. We now prove Theorem 1; the coefficients of (2) are 
taken to be rational integers. The case in which (2) represents a pair of 
straight lines was treated in Case 3 in the last section. If (2) represents a 
parabola, then $A = 0$ and $B \neq 0$. This was discussed in Case 2 in the last
section. Hence we can complete the proof of Theorem 1 by treating the
hyperbola. We prove this lemma.


LEMMA 5. Let (2) represent an hyperbola, so that $A > 0$ and $B^2 - 4AC \neq 0$.
Then the asymptotes are essentially rational if and only if $A$ is a perfect square.


First, if the asymptotes are rational, we can write (2) in the form

(23) $a(x + \alpha_1y + \beta_1)(x + \alpha_2y + \beta_2) = h,$

where $\alpha_1$, $\alpha_2$, $\beta_1$, and $\beta_2$ are rational, and the asymptotes are obtained by
equating to zero the expressions in parentheses. Equating coefficients in (2)
and (23), we obtain

$b = a(\alpha_1 + \alpha_2), \quad c = a\alpha_1\alpha_2.$

Hence we can write

$A = b^2 - 4ac = a^2(\alpha_1 + \alpha_2)^2 - 4a^2\alpha_1\alpha_2 = a^2(\alpha_1 - \alpha_2)^2.$

The integer $A$ is the square of a rational number, and consequently is the
square of a rational integer.

Conversely, suppose that $A = k^2 \neq 0$. In order to show that the asymptotes
are rational, we exhibit them. Multiplying (2) by $4a$, we have

$(2ax + by)^2 - k^2y^2 + 4adx + 4aey + 4af = 0.$

Multiplying by $k^2$, and completing the squares, we obtain

(24) $(2akx + bky + dk)^2 - (k^2y - 2ae + bd)^2 = T,$

where $T$ is given by

(25) $T = d^2k^2 - 4afk^2 - (2ae - bd)^2.$


The asymptotes of the hyperbola are obtained by factoring the difference of 
two squares on the left of (24), and equating the factors to zero. It is obvious 
that they are rational lines, and this completes the proof of the lemma. 




We now consider Case 1 of §4 in the light of Lemma 5, and see that we 
have proved that an hyperbola (2), with irrational asymptotes, has either no 
integral solutions or an infinite number. To complete the proof of Theorem 1, 
we must show that an hyperbola (2), with rational asymptotes, cannot have 
an infinite number of integral solutions. 

Let $(x_0, y_0)$ be a point with integral coordinates lying on the hyperbola.
Let equation (1), with rational integral coefficients, denote an asymptote.
Then the distance from the point on the curve to the asymptote is

$\frac{|lx_0 + my_0 + n|}{(l^2 + m^2)^{1/2}} \geq \frac{1}{(l^2 + m^2)^{1/2}},$

since $lx_0 + my_0 + n$ is a nonzero integer. But the asymptotes approach the
curve, so that the points of the hyperbola whose distances from the adjacent
asymptote are greater than any given positive quantity, must lie in a finite
region of the plane. Consequently, only a finite number of points with integral
coordinates lie on the hyperbola.
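
The finiteness for rational asymptotes can also be seen by direct enumeration in the normalized form (24), where the hyperbola becomes X^2 - Y^2 = T. The sketch below is an illustration with a hypothetical function name, and it uses factoring of T rather than the distance argument of the text: every integral point corresponds to a factorization T = (X - Y)(X + Y), and a nonzero T has only finitely many of these.

```python
def difference_of_squares_points(T):
    """All integer pairs (X, Y) with X*X - Y*Y == T, for a nonzero integer T."""
    points = set()
    for r in range(1, abs(T) + 1):
        if T % r:
            continue
        for u in (r, -r):            # u plays the role of X - Y
            v = T // u               # v plays the role of X + Y (exact division)
            if (u + v) % 2 == 0:     # X and Y are integers only if u, v have equal parity
                points.add(((u + v) // 2, (v - u) // 2))
    return sorted(points)
```

For T = 15 this returns the eight points with |X| = 4, |Y| = 1 or |X| = 8, |Y| = 7, and for T = 2 it returns no points at all.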

6. The proof of Theorem 2. Let the coefficients of (2) be integers of a real 
quadratic field. If (2) represents a point, then obviously it cannot have more 
than one integral solution. The situation in which (2) represents a pair of 
straight lines was treated in Case 3 of §4; a parabola in Case 2; an hyperbola, 
or an ellipse with A not totally negative in Case 1. All that remains is the 
last statement of Theorem 2, concerning the ellipse with A totally negative; 
we turn to this now. 

Multiplying equation (2) by $4a$, we get

$(2ax + by)^2 - Ay^2 + 4adx + 4aey + 4af = 0.$

We multiply by $-A$, and complete the squares to arrive at

(26) $-AX^2 + Y^2 = (bd - 2ae)^2 - A(d^2 - 4af),$

where

(27) $X = 2ax + by + d, \quad Y = Ay + bd - 2ae.$


Suppose that the quadratic field with which we are dealing is $R(m^{1/2})$, where
$m$ is positive. Then the integer $A$, being totally negative, has the form

(28) $A = -p - qm^{1/2}, \quad p - |q|m^{1/2} > 0,$

where $p$ and $q$ are rational integers, or perhaps the halves of odd rational
integers in case $m \equiv 1 \pmod 4$. We are looking for integral values of $x$ and $y$
in $R(m^{1/2})$, so we suppose that $X = w + tm^{1/2}$, and $Y = u + vm^{1/2}$. Let the right
side of equation (26) be $r + sm^{1/2}$. The quantities $w$, $t$, $u$, $v$, $r$, and $s$ are rational
integers (or perhaps the halves of odd rational integers).

Substituting these values in (26), and equating the rational parts, we have
the result


(29) $p(w^2 + t^2m) + 2qmwt + u^2 + v^2m = r.$

The inequality in (28) enables us to write

$p(w^2 + t^2m) \geq 2p\,|wtm^{1/2}| \geq |2qmwt|,$

so that $r$ must not be negative if (29) is to have any solutions. Equation (29)
implies that

$u^2 + v^2m \leq r, \quad pw^2 + pmt^2 + 2qmwt \leq r.$

Clearly the first of these inequalities has only a finite number of solutions in
integers (or halves of odd integers) $u$ and $v$. The same is true of the second
inequality in $w$ and $t$, because the discriminant of the left side is

$4q^2m^2 - 4mp^2 < 4q^2m^2 - 4m(mq^2) = 0,$

by (28). Hence the number of integral solutions in $X$ and $Y$ of (26) is finite,
and, by (27), the number of integral solutions of (2) is finite.

7. Imaginary quadratic fields. We now prove Theorem 3. The situation in
which $B^2 - 4AC = 0$, that is, in which the left side of (2) factors, was treated
in Case 3 of §4. When $B^2 - 4AC \neq 0$, Cases 1 and 2 handle the situations
with $A$ not a perfect square, and $A$ zero, respectively. All that remains to be
proved, therefore, is that (2) cannot have an infinite number of solutions
when $B^2 - 4AC \neq 0$ and $A$ is a perfect square in F, say $k^2$. We can proceed
as in §5, and obtain equations (24) and (25); $T$ must be different from zero,
since otherwise the left side of (2) would be factorable into two linear factors,
contrary to hypothesis. We use the substitution

$X = 2akx + bky + dk, \quad Y = k^2y - 2ae + bd,$

to write (24) in the form $X^2 - Y^2 = T$, from which we get

(30) $|X - Y| \cdot |X + Y| = |T|.$


As in the last section, we show that there is only a finite number of solutions
in $X$ and $Y$, and this implies the result we want. Now the positive rational
integer $|T|$ can be factored into a pair of positive rational integers in but a
finite number of ways. Any integral solution of (30) must correspond to one
of these factorings. For any such factoring, say $|T| = rs$, we can write
$|X - Y| = r$ and $|X + Y| = s$, or vice versa. But there is only a finite number
of integers of any imaginary quadratic field with absolute value equal to a
given rational integer. Hence we have only a finite number of pairs $(X - Y,
X + Y)$ satisfying (30), and each pair gives at most one integral solution
$(X, Y)$.


UNIVERSITY OF ILLINOIS,
URBANA, ILL.


ON THE PARTIAL SUMS OF HARMONIC DEVELOPMENTS 
AND OF POWER SERIES 


BY 
OTTO SZÁSZ


1. Introduction. Consider the class E of power series $f(z) = \sum_0^\infty c_\nu z^\nu$,
convergent for $|z| < 1$ and such that $|f(z)| \leq 1$. The following result is due to
I. Schur and G. Szegő [5](1).

For any series of the class E,

$|s_n(z)| = \left|\sum_0^n c_\nu z^\nu\right| \leq 1$

in $|z| \leq r_n$, but not always in $|z| < r_n + \epsilon$, $\epsilon > 0$, where $r_n$ is the largest $r$ for
which

$T_n(r, \theta) = \frac{1}{2} + \sum_1^n r^\nu \cos\nu\theta \geq 0$ for all $\theta$.

The $r_n$ are non-decreasing,

$r_n > 1 - \frac{\log 2n}{n}, \quad n = 1, 2, 3, \ldots,$

$r_n = 1 - \frac{\log 2n - \log\log 2n + \epsilon_n}{n}, \quad \lim_{n\to\infty}\epsilon_n = 0.$

We obtain the same constant $r_n$ if we assume $\Re f(z) \geq 0$ and require $\Re s_n(z) \geq 0$.
Here $\Re u$ means the real part of $u$; $\Im u$ will denote the imaginary part.
In what follows, we consider harmonic sine developments

$H(r, \theta) = \sum_1^\infty b_\nu r^\nu \sin\nu\theta,$

convergent for $0 < r < 1$, and non-negative for $0 < \theta < \pi$. Evidently there exists
an $R_n$ with the following properties:
(a) Whenever

(1.1) $H(r, \theta) \geq 0, \quad 0 < r < 1; \ 0 < \theta < \pi,$

then,

(1.2) $s_n(r, \theta) = \sum_1^n b_\nu r^\nu \sin\nu\theta \geq 0, \quad 0 < r \leq R_n, \ 0 < \theta < \pi.$

(b) For any $\epsilon > 0$ we can find an $H$ satisfying (1.1) and such that $s_n(r, \theta)$
becomes negative for some $\theta$ and some $r < R_n + \epsilon$.

We denote the class of harmonic functions satisfying (1.1) by T. On writing
$f(z) = \sum_1^\infty b_\nu z^\nu$, the power series $f(z)$ is regular in $|z| < 1$, has all its
coefficients real, and $\Im f(z) \geq 0$ in $|z| < 1$, $\Im z > 0$. The class T has been discussed by
Rogosinski [4]; the function $f(z)$ is called typically real. (Cf. also S. Mandelbrojt
[2].)

Presented to the Society, January 1, 1941; received by the editors March 10, 1941, and in
revised form, May 7, 1941. The author is indebted to the referee for valuable suggestions.
(1) Numbers in brackets refer to the literature at the end of this paper.

One of the results of the present paper is 


31 log | 
nN nN 


M. S. Robertson [3] gave the erroneous estimate

$R_n \geq 1 - 2\log n/n$ for $n > 12$.

His calculation yields however, as is seen easily, $R_n \geq 1 - 4\log n/n$, for
$n > n_0$(2). We then apply the properties of $R_n$ to Fourier series of convex
functions and to a certain class of power series.


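Note that if $\phi(\theta) \sim \sum_1^\infty b_\nu \sin\nu\theta$, $\phi(\theta) \geq 0$, $0 < \theta < \pi$, then

$H(r, \theta) = \frac{2}{\pi}\int_0^\pi \phi(x)\Big(\sum_1^\infty r^\nu \sin\nu x \sin\nu\theta\Big)dx$

$\quad = \frac{1}{\pi}\int_0^\pi \phi(x)\sum_1^\infty r^\nu[\cos\nu(\theta - x) - \cos\nu(\theta + x)]\,dx$

$\quad = \frac{1 - r^2}{2\pi}\int_0^\pi \phi(x)\Big[\frac{1}{1 - 2r\cos(\theta - x) + r^2} - \frac{1}{1 - 2r\cos(\theta + x) + r^2}\Big]dx$

$\quad = \frac{2r(1 - r^2)\sin\theta}{\pi}\int_0^\pi \frac{\phi(x)\sin x}{[1 - 2r\cos(\theta - x) + r^2][1 - 2r\cos(\theta + x) + r^2]}\,dx \geq 0.$

Hence $H(r, \theta)$ belongs to the class T. (Cf. Zygmund [8, p. 57].)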
2. Characterization of $R_n$. We quote the following lemma, due to Fejér
(Turán [7]).


LEMMA 1. In order that

$\sum_{\nu=1}^n \lambda_\nu \sin\nu x \sin\nu y \geq 0$

it is necessary and sufficient that

$\sum_1^n \nu\lambda_\nu \sin\nu\theta \geq 0.$

(2) Robertson, Annals of Mathematics, (2), vol. 42 (1941), pp. 829-838. 




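We now prove


THEOREM 1. The quantity $R_n$ as defined in §1 is the largest $r$ for which

(2.1) $S_n(r, \theta) = \sum_1^n \nu r^\nu \sin\nu\theta \geq 0$ for $0 < \theta < \pi$.

We have for $0 < \rho < 1$,

$\rho^\nu b_\nu = \frac{2}{\pi}\int_0^\pi H(\rho, x)\sin\nu x\,dx, \quad \nu = 1, 2, 3, \ldots;$

$s_n(r, \theta) = \frac{2}{\pi}\int_0^\pi H(\rho, x)\Big(\sum_1^n \Big(\frac{r}{\rho}\Big)^\nu \sin\nu\theta \sin\nu x\Big)dx.$

For any $r < R_n$ we can choose $\rho < 1$ so that $r/\rho < R_n$; we then obtain by Lemma
1 (for $\lambda_\nu = (r/\rho)^\nu$) $s_n(r, \theta) \geq 0$ for any $r < R_n$ and for $0 < \theta < \pi$; hence (a) holds for
$r < R_n$. Conversely, for the function

$H(r, \theta) = \sum_1^\infty \nu r^\nu \sin\nu\theta = \sin\theta\,\frac{r(1 - r^2)}{(1 - 2r\cos\theta + r^2)^2} > 0$

the function

$S_n(r, \theta) = \sum_1^n \nu r^\nu \sin\nu\theta$

becomes negative for any $r > R_n$ and for some $\theta$ in $(0, \pi)$. This proves Theorem
1. To estimate $R_n$ we first give another characterization for it. An easy
calculation yields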

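$\frac{(1 - 2r\cos\theta + r^2)^2}{r\sin\theta}\sum_1^n \nu r^\nu \sin\nu\theta$

$\quad = 1 - r^2 - (n + 1)r^{n+2}\frac{\sin(n - 1)\theta}{\sin\theta} + r^{n+1}(2n + 2 + nr^2)\frac{\sin n\theta}{\sin\theta}$

$\qquad - r^n(n + 1 + 2nr^2)\frac{\sin(n + 1)\theta}{\sin\theta} + nr^{n+1}\frac{\sin(n + 2)\theta}{\sin\theta}$

$\quad = C_n(r, \theta).$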
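This furnishes

THEOREM 2. $R_n$ is the largest $r$ for which

$C_n(r, \theta) \geq 0.$

Evidently

$\left|\frac{\sin k\theta}{\sin\theta}\right| \leq k, \quad \lim_{\theta\to\pi}\frac{\sin k\theta}{\sin\theta} = (-1)^{k-1}k;$

hence

$C_n(r, \pi) = 1 - r^2 + (-1)^{n+1}\{(n^2 - 1)r^{n+2} + nr^{n+1}(2n + 2 + nr^2)$
$\qquad + (n + 1)r^n(n + 1 + 2nr^2) + n(n + 2)r^{n+1}\}.$

Thus

$C_n(r, \theta) \geq 1 - r^2 - \{(n^2 - 1)r^{n+2} + nr^{n+1}(2n + 2 + nr^2)$
$\qquad + (n + 1)r^n(n + 1 + 2nr^2) + n(n + 2)r^{n+1}\},$

and equality holds if $n = 2k$, and $\theta = \pi$. This yields

THEOREM 3. Denote the unique positive root of the equation

$p_n(r) = 1 - r^2 - (n + 1)^2r^n - n(3n + 4)r^{n+1} - (3n^2 + 2n - 1)r^{n+2} - n^2r^{n+3} = 0$

by $\rho_n$. Then $R_n \geq \rho_n$, and equality holds for $n = 2k$, $k \geq 1$.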

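Note that $p_n(0) = 1$, $p_n(1) < 0$, $p_n'(r) < 0$. Hence $\rho_n$ is unique and

(2.2) $0 < \rho_n < 1.$

Evidently $p_n(-1) = 0$, hence $1 + r$ can be factored out, and we get

(2.3) $\frac{p_n(r)}{1 + r} = 1 - r - (n + 1)^2r^n - (2n^2 + 2n - 1)r^{n+1} - n^2r^{n+2} \equiv q_n(r),$

so that $q_n(\rho_n) = 0$.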

3. Estimation of $\rho_n$ and $R_n$. Direct calculation gives

$R_1 = 1; \quad \rho_1 = 0.182\ldots.$

Also $\rho_2 = R_2$, and

$S_2(r, \theta) = r\sin\theta + 2r^2\sin 2\theta = r\sin\theta(1 + 4r\cos\theta),$

which yields by Theorem 1: $R_2 = 1/4 = \rho_2$. A similar calculation yields
$R_3 = 2^{1/2}/3$.
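
The small cases can be checked numerically. The following sketch is an illustration only and rests on two assumptions: the condition (2.1) is tested on a finite grid of angles, so the computed values of R_n are approximations, and the polynomial of (2.3) is read as q_n(r) = 1 - r - (n+1)^2 r^n - (2n^2+2n-1) r^{n+1} - n^2 r^{n+2}. The function names are hypothetical.

```python
import math


def S(n, r, theta):
    """S_n(r, theta) = sum over nu of nu * r^nu * sin(nu * theta)."""
    return sum(nu * r ** nu * math.sin(nu * theta) for nu in range(1, n + 1))


def is_nonnegative(n, r, grid=2000):
    """Grid test of S_n(r, theta) >= 0 on 0 < theta < pi (approximate)."""
    return all(S(n, r, k * math.pi / (grid + 1)) >= -1e-15
               for k in range(1, grid + 1))


def estimate_R(n, grid=2000, iters=40):
    """Bisection for the largest r in [0, 1] passing the grid test."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if is_nonnegative(n, mid, grid):
            lo = mid
        else:
            hi = mid
    return lo


def q(n, r):
    """The polynomial q_n(r), as read from (2.3)."""
    return (1 - r - (n + 1) ** 2 * r ** n
            - (2 * n * n + 2 * n - 1) * r ** (n + 1) - n * n * r ** (n + 2))


def rho(n, iters=60):
    """Bisection for the root of q_n in (0, 1); q_n is decreasing there."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if q(n, mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

Under these assumptions `estimate_R(2)` and `rho(2)` both come out close to 1/4, and `estimate_R(3)` is close to 2^{1/2}/3, in agreement with the values stated in the text.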
We shall prove

(3.1) $\rho_n = 1 - \frac{3\log n}{n} + \frac{\log\log n + \log 3/4 + \epsilon_n}{n}, \quad \epsilon_n \to 0$ as $n \to \infty$.


Let $c$ be a constant, and

(3.2) $r_n(c) = 1 - \frac{3\log n}{n} + \frac{\log\log n + c}{n};$

then from

$\log(1 - x) = -x + O(x^2)$ as $x \to 0$

we conclude

$\{r_n(c)\}^n = \exp\{-3\log n + \log\log n + c + O(n^{-1}\log^2 n)\}$

(3.3) $\quad = n^{-3}\log n \cdot e^c\{1 + O(n^{-1}\log^2 n)\}$ as $n \to \infty$.


Furthermore, from (2.3), (3.2), and (3.3) 


3logn loglogn+c  4logn 
n 


Gn{ra(c)} = -e°{1 + o(1)}, 


nN 


} 
log n 


—+3— 


Thus for 
c = log 3/4 + 


€ a given small number, and for sufficiently large values of n 


sgn gn{tn(c)} = sgn ¢, 


from which follows (3.1). 
We have thus proved ~ 


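THEOREM 4. If $\rho_n > 0$ and $p_n(\rho_n) = 0$, then

\[ \rho_n = 1 - \frac{3\log n}{n} + \frac{\log\log n + \log(3/4) + \epsilon_n}{n}, \qquad \text{where } \epsilon_n \to 0 \text{ as } n \to \infty. \]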
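The root $\rho_n$ and the remainder $\epsilon_n$ of Theorem 4 can be explored numerically. The sketch below uses the factored polynomial of (2.3) and plain bisection; the names `g`, `rho` and `epsilon` are illustrative, and the bisection is an assumption of the example rather than anything prescribed by the paper.

```python
import math

def g(n, r):
    """Polynomial of (2.3): p_n(r) / (1 + r)."""
    return (1 - r
            - (n + 1) ** 2 * r ** n
            - (2 * n * n + 2 * n - 1) * r ** (n + 1)
            - n * n * r ** (n + 2))

def rho(n, iterations=200):
    """Unique root of g(n, .) in (0, 1); g is decreasing with g(0) > 0 > g(1)."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if g(n, mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def epsilon(n):
    """Remainder eps_n defined by rho_n = 1 - 3 log n / n + (log log n + log(3/4) + eps_n) / n."""
    return (n * (rho(n) - 1) + 3 * math.log(n)
            - math.log(math.log(n)) - math.log(0.75))
```

Because the neglected terms are of relative order $\log\log n/\log n$, the remainder decreases only slowly, so small values of `epsilon(n)` should be expected only for quite large `n`.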


4. Derivation of an asymptotic estimate for $R_n$. On writing

\[ R_n = 1 - \frac{3\log n}{n} + \frac{\log\log n + \delta_n}{n}, \]

it follows from Theorem 3 that
\[ \delta_n \ge \log(3/4) + \epsilon_n, \]
and equality holds for $n = 2k$, $k \ge 1$; hence from Theorem 4
\[ \liminf_{n\to\infty} \delta_n = \log(3/4), \qquad \lim_{k\to\infty} \delta_{2k} = \log(3/4). \]
It remains to give an estimate for $R_{2k-1}$ from above.

If for a particular value of $\theta$ and $r$, $C_{2k-1}(r,\theta) < 0$, then by Theorem 2,
evidently $R_{2k-1} < r$. We now choose $\theta = \pi - (3\pi/4k)$; then


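\[
\begin{aligned}
C_{2k-1}(r,\theta) ={}& 1 - r^2 + \frac{1}{\sin(3\pi/4k)}\Big\{ 2k r^{2k+1}\sin\frac{3(k-1)\pi}{2k}
   + r^{2k}[4k + (2k-1)r^2]\sin\frac{3(2k-1)\pi}{4k} \\
 & \qquad + r^{2k-1}[2k + 2(2k-1)r^2]\sin\frac{3\pi}{2}
   + (2k-1)r^{2k}\sin\frac{3(2k+1)\pi}{4k} \Big\} \\
 ={}& 1 - r^2 - \frac{1}{\sin(3\pi/4k)}\Big\{ 2k r^{2k+1}\cos\frac{3\pi}{2k}
   + r^{2k}[4k + (2k-1)r^2]\cos\frac{3\pi}{4k} \\
 & \qquad + 2k r^{2k-1} + (4k-2)r^{2k+1} + (2k-1)r^{2k}\cos\frac{3\pi}{4k} \Big\} \\
 <{}& 1 - r^2 - \frac{4k}{3\pi}\Big\{ 2k r^{2k+1}\Big(1 - \frac{9\pi^2}{8k^2}\Big)
   + r^{2k}[4k + (2k-1)r^2]\Big(1 - \frac{9\pi^2}{32k^2}\Big) \\
 & \qquad + 2k r^{2k-1} + (4k-2)r^{2k+1} + (2k-1)r^{2k}\Big(1 - \frac{9\pi^2}{32k^2}\Big) \Big\}, \qquad k \ge 3
\end{aligned}
\]
(since $\cos x > 1 - x^2/2$ for all $x$). Hence
\[
\begin{aligned}
C_{2k-1}(r,\theta) <{}& 1 - r^2 - (2k/5)\{ k r^{2k+1} + r^{2k}[2k + (k - 1/2)r^2] + 2k r^{2k-1} \\
 & \qquad + (4k-2)r^{2k+1} + (k - 1/2)r^{2k} \}, \qquad k \ge 5,
\end{aligned}
\]
thus

\[ C_{2k-1}(r,\theta) < 1 - r^2 - (2k/5)(11k-3)r^{2k+2} < 1 - r^2 - 4k^2 r^{2k+2}. \]
Choosing $r$ so that
(4.1) \[ 1 - r^2 - 4k^2 r^{2k+2} < 0, \]
we get
\[ C_{2k-1}(r,\theta) < 0, \qquad R_{2k-1} < r < 1. \]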

To find an upper bound for $r$, we put

(4.2) \[ r = 1 - \frac{3\log(2k-1)}{2k-1} + \frac{\log\log(2k-1) + c}{2k-1}; \]

we obtain as in (3.3)
\[
\begin{aligned}
r^{2k-1} &= \exp\{-3\log(2k-1) + \log\log(2k-1) + c + O(k^{-1}\log^2 k)\} \\
 &= (2k-1)^{-3}\log(2k-1)\cdot e^{c}\{1 + O(k^{-1}\log^2 k)\}.
\end{aligned}
\]

Thus, using (4.2),
\[ \frac{4k^2 r^{2k+2}}{1 - r^2} = \frac{e^{c}}{6}\{1 + o(1)\} \qquad \text{as } k \to \infty. \]

Hence (4.1) is satisfied for all sufficiently large $k$ provided $e^{c}/6 > 1$, that is,
$c > \log 6$. It now follows that $\limsup \delta_{2k-1} \le \log 6$. Summarizing we have


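THEOREM 5. Let

\[ R_n = 1 - \frac{3\log n}{n} + \frac{\log\log n + \delta_n}{n}, \qquad n > 1; \]

then $\lim_{k\to\infty} \delta_{2k} = \log(3/4)$, and
\[ \log(3/4) \le \liminf_{k\to\infty} \delta_{2k-1} \le \limsup_{k\to\infty} \delta_{2k-1} \le \log 6. \]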


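5. Application to Fourier series. Consider the roof-function

\[
\left.
\begin{array}{ll}
 \dfrac{b\,\theta}{a} & \text{for } 0 \le \theta \le a, \\[2ex]
 \dfrac{b\,(\pi-\theta)}{\pi - a} & \text{for } a \le \theta \le \pi
\end{array}
\right\}
 = \frac{2b}{a(\pi-a)}\sum_{\nu=1}^{\infty}\frac{\sin\nu a\,\sin\nu\theta}{\nu^2},
\]

where $0 < a < \pi$, and the corresponding harmonic function

\[ \frac{2b}{a(\pi-a)}\sum_{\nu=1}^{\infty} r^{\nu}\,\frac{\sin\nu a\,\sin\nu\theta}{\nu^2} = H(r; a, b). \]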
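As an illustration, the sine series of the roof-function can be summed numerically and compared with the piecewise-linear definition. This is a minimal sketch; the names `roof` and `roof_partial_sum` and the sample values of $a$, $b$ are assumptions made for the example.

```python
import math

def roof(theta, a, b):
    """Roof-function: rises linearly to height b at theta = a, then falls to 0 at pi."""
    if theta <= a:
        return b * theta / a
    return b * (math.pi - theta) / (math.pi - a)

def roof_partial_sum(theta, a, b, n, r=1.0):
    """n-th partial sum of H(r; a, b); r = 1 gives the Fourier sine series itself."""
    coefficient = 2 * b / (a * (math.pi - a))
    return coefficient * sum(
        r ** v * math.sin(v * a) * math.sin(v * theta) / v ** 2
        for v in range(1, n + 1)
    )
```

The coefficients decay like $\nu^{-2}$, so the truncation error at $r = 1$ is of order $1/n$; for $r < 1$ the convergence is geometric.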


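Denote its partial sums by

\[ H_n(r,\theta) = \frac{2b}{a(\pi-a)}\sum_{\nu=1}^{n} r^{\nu}\,\frac{\sin\nu a\,\sin\nu\theta}{\nu^2}; \]

then

\[ \frac{\partial^2 H_n(r,\theta)}{\partial\theta^2} = -\frac{2b}{a(\pi-a)}\sum_{\nu=1}^{n} r^{\nu}\sin\nu a\,\sin\nu\theta \le 0 \]

for $0 < r \le R_n$, $0 < \theta < \pi$, by Lemma 1 and Theorem 1. Hence $H_n(r,\theta)$ is convex
upwards for $0 < \theta < \pi$, $r \le R_n$; but not convex for $r > R_n$. The same is
true for the limiting cases $a \to 0$ and $a \to \pi$. In which cases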


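\[ H(r; \theta) = \frac{2b}{\pi}\sum_{\nu=1}^{\infty} r^{\nu}\,\frac{\sin\nu\theta}{\nu}, \qquad
   H(r; \theta) = \frac{2b}{\pi}\sum_{\nu=1}^{\infty} r^{\nu}\,\frac{\sin\nu(\pi-\theta)}{\nu}. \]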
Moreover every polygon convex upwards and lying above the axis of abscissae
is expressible as a finite sum with positive coefficients of roof-functions.
Hence the partial sums of the corresponding harmonic development are convex
upwards for $r \le R_n$. Finally any function positive in $0 < \theta < \pi$, and convex
upwards can be approximated uniformly by such polygons; hence we have


THEOREM 6. If $f(\theta) > 0$ in $0 < \theta < \pi$, and is convex upwards, and if
$f(\theta) \sim \sum_1^{\infty} b_{\nu}\sin\nu\theta$, then $\sum_1^{n} b_{\nu} r^{\nu}\sin\nu\theta$ is convex upwards in $0 < \theta < \pi$,
for $0 < r \le R_n$, but not always for $r < R_n + \epsilon$, $\epsilon > 0$.


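6. Cosine series. We now consider the cosine series of the step function

\[
\left.
\begin{array}{ll}
 0 & \text{for } 0 \le \theta < a, \\
 b & \text{for } a < \theta \le \pi
\end{array}
\right\}
 \sim \frac{b(\pi-a)}{\pi} - \frac{2b}{\pi}\sum_{\nu=1}^{\infty}\frac{\sin\nu a\,\cos\nu\theta}{\nu},
\]
where $0 < a < \pi$, $b > 0$; and the corresponding harmonic development

\[ K(r,\theta) = \frac{b(\pi-a)}{\pi} - \frac{2b}{\pi}\sum_{\nu=1}^{\infty} r^{\nu}\,\frac{\sin\nu a\,\cos\nu\theta}{\nu}. \]

For the partial sums $K_n(r,\theta)$ of this series we have

\[ \frac{\partial K_n(r,\theta)}{\partial\theta} = \frac{2b}{\pi}\sum_{\nu=1}^{n} r^{\nu}\sin\nu a\,\sin\nu\theta \ge 0 \quad \text{for } r \le R_n,\ 0 < \theta < \pi; \]

hence $K_n(r,\theta)$ is monotonic increasing in the same domain; $R_n$ cannot be
replaced by $R_n + \epsilon$, $\epsilon > 0$. The same statement for any monotonic increasing
function follows now in an obvious way. Hence we have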


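THEOREM 7. If $f(\theta)$ is monotonic in $0 < \theta < \pi$, and
\[ f(\theta) \sim a_0/2 + \sum_{\nu=1}^{\infty} a_{\nu}\cos\nu\theta, \]
then the $n$th partial sum of $a_0/2 + \sum_1^{\infty} a_{\nu} r^{\nu}\cos\nu\theta$ is monotonic in the same sense
for $0 < r \le R_n$, and here $R_n$ cannot be replaced by $R_n + \epsilon$, $\epsilon > 0$.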


7. Curves convex in direction of the v-axis. We say that a curve in the
$(u, v)$-plane is convex in the direction of the $v$-axis if any parallel to the $v$-axis
has at most two points in common with the curve. This class of mappings
was considered by L. Fejér [1] and the author [6]. We now prove


THEOREM 8. Suppose the power series $\sum_1^{\infty} a_{\nu} z^{\nu} = f(z) = w = u + iv$ is regular
in $|z| < 1$, and all $a_{\nu}$ are real. Suppose further that the images $K_r$ of the circles
$|z| = r$, $0 < r < 1$, are convex in the direction of the $v$-axis (thus $f(z)$ is univalent).
Then the partial sum $\sum_1^{n} a_{\nu} z^{\nu}$ has the same property in $|z| < R_n$, but, in general,
not in a larger circle.


For the proof we may assume without loss of generality that the upper
half of the circle $|z| < 1$ is mapped onto the upper half of the image in the
$w$-plane. On writing $w(e^{i\theta}) = u(\theta) + iv(\theta) \sim \sum a_{\nu}\cos\nu\theta + i\sum a_{\nu}\sin\nu\theta$, we find
that $v(\theta)$ is positive for $0 < \theta < \pi$, and (from the assumption) $u(\theta)$ is decreasing
in the same interval. Our theorem follows now from Theorems 5 and 7.

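8. Conclusion. Suppose $f(z) = \sum_1^{\infty} b_{\nu} z^{\nu}$ is a typically real function, that is,

\[ \sum_{\nu=1}^{\infty} b_{\nu} r^{\nu}\sin\nu\theta > 0 \quad \text{for } 0 < r < 1,\ 0 < \theta < \pi. \]
Then the Riesz means of second order
\[ P_n(z) = (n+1)^{-2}\sum_{\nu=1}^{n} (n - \nu + 1)^2\, b_{\nu} z^{\nu}, \qquad n \ge 1, \]

are typically real in $|z| \le 1$ (Szász [6]; cf. Theorem 1). Evidently $\lim_{n\to\infty} P_n(z)
= f(z)$ in $|z| < 1$, uniformly in $|z| \le r$, $r < 1$. Another such sequence of
polynomials is

\[ s_n(R_n z) = \sum_{\nu=1}^{n} b_{\nu} R_n^{\nu} z^{\nu}, \qquad n \ge 1. \]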


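These polynomials are typically real in $|z| \le 1$ by property (a) of §1.
Furthermore for $|z| \le r < 1$

\[
|f(z) - s_n(R_n z)| \le \sum_{\nu=1}^{n} (1 - R_n^{\nu})\,|b_{\nu}|\,r^{\nu} + \sum_{\nu=n+1}^{\infty} |b_{\nu}|\,r^{\nu}
 \le (1 - R_n)\sum_{\nu=1}^{n} \nu\,|b_{\nu}|\,r^{\nu} + \sum_{\nu=n+1}^{\infty} |b_{\nu}|\,r^{\nu} \to 0, \quad \text{as } n \to \infty.
\]

Hence
\[ \lim_{n\to\infty} s_n(R_n z) = f(z) \]
uniformly in $|z| \le r < 1$.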


REFERENCES 


1. L. Fejér, Neue Eigenschaften der Mittelwerte bei den Fourierreihen, Journal of the London
Mathematical Society, vol. 8 (1933), pp. 53-62.

2. S. Mandelbrojt, Quelques remarques sur les fonctions univalentes, Bulletin des Sciences
Mathématiques, (2), vol. 58 (first part) (1934), pp. 185-200.

3. M. S. Robertson, On the theory of univalent functions, Annals of Mathematics, (2), vol. 37
(1936), pp. 374-408.

4. W. Rogosinski, Über positive harmonische Entwicklungen und typischreelle Potenzreihen,
Mathematische Zeitschrift, vol. 35 (1932), pp. 93-121.

5. I. Schur and G. Szegö, Über die Abschnitte einer im Einheitskreise beschränkten
Potenzreihe, Sitzungsberichte der Preussischen Akademie, 1923, pp. 545-560.

6. O. Szász, On the Cesàro and the Riesz means of Fourier series, Compositio Mathematica,
vol. 7 (1939), pp. 112-122.

7. P. Turán, Über die monotone Konvergenz der Cesàro-Mittel bei Fourier- und Potenzreihen,
Proceedings of the Cambridge Philosophical Society, vol. 34, Part II (1938), pp. 134-143.

8. A. Zygmund, Trigonometrical Series, 1935.


UNIVERSITY OF CINCINNATI, 
CINCINNATI, OHIO 




HYPERSPACES OF A CONTINUUM


BY 
J. L. KELLEY 


Introduction. Among the topological invariants of a space $X$ certain
spaces have frequently been found valuable. The space of all continuous
functions on $X$ and the space of mappings of $X$ into a circle are noteworthy
examples. It is the purpose of this paper to study two particular invariant
spaces associated with a compact metric continuum $X$; namely, $2^X$, which
consists of all closed nonvacuous subsets of $X$, and $\mathcal{C}(X)$, which consists of
closed connected nonvacuous subsets(1). The aim of this study is twofold.
First, we wish to investigate at length the topological properties of the
hyperspaces, and, second, to make use of their structure to prove several general
theorems.

If $X$ is a compact metric continuum it is known that: $2^X$ is Peanian if $X$
is Peanian [7], and conversely [8]; $2^X$ is always arcwise connected [1]; $2^X$ is
the continuous image of the Cantor star [4]; if $X$ is Peanian, each of $2^X$ and
$\mathcal{C}(X)$ is contractible in itself [9]; and if $X$ is Peanian, $2^X$ and $\mathcal{C}(X)$ are
absolute retracts [10].

In §§1-5 of this paper further topological properties are obtained. In
particular: $2^X$ has vanishing homology groups of dimension greater than 0, both
hyperspaces have very strong higher local connectivity and connectivity
properties, including local $p$-connectedness in the sense of Lefschetz for $p > 0$,
and the question of dimension is resolved except for the dimension of $\mathcal{C}(X)$
when $X$ is non-Peanian. All of the results of the preceding paragraph for $2^X$
are shown simultaneously for $2^X$ and $\mathcal{C}(X)$ in the course of the development.

In §6 a characterization of local separating points in terms of $\mathcal{C}(X)$ is
obtained and a theorem of G. T. Whyburn deduced. In §7 it is shown that
for a continuous transformation $f(X) = Y$ we may under certain conditions
find $X_0 \subset X$, with $X_0$ closed and of dimension 0, such that $f(X_0) = Y$. In §8
this result is utilized in the study of Knaster continua. In order that $X$ be a
Knaster continuum it is necessary and sufficient that $\mathcal{C}(X)$ contain a unique
arc between every pair of elements. If there exist Knaster continua of
dimension greater than 1 then there exist infinite-dimensional Knaster continua.


Presented to the Society in three parts, the first under the title On the hyperspaces of a 
given space on December 28, 1939; the second under the present title on April 26, 1940; the 
third under the title A theorem on transformations on December 30, 1940; received by the editors 
May 15, 1941. The major part of the material in this paper was contained in the author’s dis- 
sertation, University of Virginia, June 1940. 

(1) For topologization of these spaces and for definitions of terms used in the introduction 
see the text. A bibliography is given at the end of the article. Numbers in square brackets refer 
to the bibliography. 




The author wishes to express his gratitude to Professor G. T. Whyburn 
for his help and encouragement in the preparation of this paper. 

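1. Preliminaries. Throughout the following, $X$ will denote a compact
metric continuum. The letters $a$, $b$, $c$ will stand for elements of $X$. For
$a, b \in X$, $\rho(a, b)$ is the distance from $a$ to $b$. Given a collection $a_i$, $a_i \in X$,
$\{a_i\}$ denotes the subset of $X$ whose elements are the $a_i$. In particular, $\{a\}$
is the subset of $X$ consisting of the one element $a$.

The letters $A$, $B$, $C$ stand for closed subsets of $X$. By $2^X$ we mean the
space of all closed, nonvacuous subsets of $X$ metricized by the Hausdorff
metric (that is, $\rho^1(A, B) = \text{g.l.b.}\ \{\epsilon\}$ for all $\epsilon$ such that $A \subset V_\epsilon(B)$ and
$B \subset V_\epsilon(A)$, where $V_\epsilon(A)$ is the sum of all open $\epsilon$-spheres about points of $A$
[2]). If $A \in 2^X$ then $A \subset X$. The closed subspace of $2^X$ consisting of
subcontinua of $X$ is $\mathcal{C}(X)$.

Similarly $2^{2^X}$ consists of closed, nonvacuous subsets $\mathcal{A}$, $\mathcal{B}$, $\mathcal{C}$ of $2^X$, with
Hausdorff distance $\rho^2(\mathcal{A}, \mathcal{B})$. If $\mathcal{A} \in 2^{2^X}$ then $\mathcal{A} \subset 2^X$.

For $A \in 2^X$ we define $\phi(A) = \{\{a_i\}\}$, $a_i \in A$. That is, $\phi(A)$ is the subset
of $2^X$ consisting of all elements $\{a\}$ of $2^X$ where $a \in A$. In particular $\phi(X)$ is
the set of all sets $\{a\}$. We always have $\phi(A) \subset 2^X$ and $\phi(A) \in 2^{2^X}$. For any
$A \in 2^X$, $\phi(A)$ is isometric with $A$. Similarly, for $\mathcal{A} \in 2^{2^X}$, $\phi(\mathcal{A})$ denotes the
subset of $2^{2^X}$ consisting of elements $\{A\}$, $A \in \mathcal{A}$. We have $\phi(\mathcal{A}) \subset 2^{2^X}$ and in
particular $\phi(2^X) \subset 2^{2^X}$.

For $\mathcal{A} \subset 2^X$ we define $\sigma(\mathcal{A}) = \sum A$ for all $A \in \mathcal{A}$. For every $\mathcal{A}$, $\sigma(\mathcal{A}) \subset X$.


Actually $\sigma$ is a continuous mapping of $2^{2^X}$ onto $2^X$. Further:
1.1. Lemma. (a) $\sigma$ is a contraction, (b) $\phi\sigma$ is a retraction of $2^{2^X}$ onto $\phi(2^X)$.


Proof. First, for $\mathcal{A} \in 2^{2^X}$, $\sigma(\mathcal{A})$ is closed. Suppose $a_i \in \sigma(\mathcal{A})$, and $\lim a_i = a$.
Choose $A_i$, $a_i \in A_i \in \mathcal{A}$. We can suppose $\lim A_i = A$. Since $\mathcal{A}$ is closed, $A \in \mathcal{A}$
and $a \in A$. Hence $a \in \sigma(\mathcal{A})$.

Second, suppose $\rho^1(\sigma(\mathcal{A}), \sigma(\mathcal{B})) = d$. We can choose in one of $\sigma(\mathcal{A})$, $\sigma(\mathcal{B})$,
say in $\sigma(\mathcal{A})$, a point $a$ which is at least $d$ distance from every point of $\sigma(\mathcal{B})$.
Choose $A$, $a \in A \in \mathcal{A}$. This set $A$ is then at least $d$ Hausdorff distance from
every set $B \in \mathcal{B}$. Hence $\rho^2(\mathcal{A}, \mathcal{B}) \ge d$ and $\sigma$ is shown to be a contraction.
That $\sigma$ followed by $\phi$ leaves every element of $\phi(2^X)$ fixed is clear.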
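For finite sets the Hausdorff metric and the union map $\sigma$ can be computed directly, which makes the contraction property of Lemma 1.1 easy to test on examples. The sketch below is a finite analogue only; the names `hausdorff` and `union_map`, and the use of points on the real line, are assumptions of the example.

```python
def hausdorff(A, B, d):
    """Hausdorff distance between finite nonempty sets A, B for a metric d."""
    far_a = max(min(d(a, b) for b in B) for a in A)  # farthest point of A from B
    far_b = max(min(d(a, b) for a in A) for b in B)  # farthest point of B from A
    return max(far_a, far_b)

def union_map(family):
    """sigma: the union of all sets in a finite family of sets."""
    return frozenset().union(*family)

def point_metric(x, y):
    """Distance between points of the real line."""
    return abs(x - y)

def set_metric(P, Q):
    """rho^1: Hausdorff distance between finite sets of points."""
    return hausdorff(P, Q, point_metric)

# Two families of closed sets (elements of the second hyperspace).
family_a = [frozenset({0, 1}), frozenset({2})]
family_b = [frozenset({0}), frozenset({2, 3})]

distance_between_unions = set_metric(union_map(family_a), union_map(family_b))
distance_between_families = hausdorff(family_a, family_b, set_metric)  # rho^2
```

In this example both distances equal 1, so the inequality $\rho^1(\sigma(\mathcal{A}), \sigma(\mathcal{B})) \le \rho^2(\mathcal{A}, \mathcal{B})$ holds with equality.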


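1.2. Lemma. If $\mathcal{A}$ is a subcontinuum of $2^X$ and $\mathcal{A}\cdot\mathcal{C}(X) \ne 0$ then $\sigma(\mathcal{A})$ is a
continuum.


Proof. Choose $A \in \mathcal{A}\cdot\mathcal{C}(X)$. Suppose $\sigma(\mathcal{A}) = A_1 + A_2$ is a separation, with
$A \subset A_1$. Then both the subset $\mathcal{A}_1$ of $\mathcal{A}$ consisting of all elements contained
in $A_1$ and the subset $\mathcal{A}_2$ of all elements intersecting $A_2$ are closed and
nonvacuous. But $\mathcal{A}_1 + \mathcal{A}_2 = \mathcal{A}$, a continuum, and $\mathcal{A}_1\cdot\mathcal{A}_2 = 0$. We then have a
contradiction.

It is possible to define(2) a real-valued function $\mu(A)$, continuous on $2^X$,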


(2) See H. Whitney, Regular families of curves, Annals of Mathematics, (2), vol. 34 (1933), 
p. 246. 




with the properties: 

1.3. If $A \subset B$, $A \ne B$ then $\mu(A) < \mu(B)$.

1.4. $\mu(X) = 1$, and for any $a \in X$, $\mu(\{a\}) = 0$.
For convenience, we shall suppose throughout that $\mu(A)$ is a certain fixed
function with these properties. Since $2^X$ is compact we can further state:


1.5. Lemma. There exists $\eta(\epsilon) > 0$ such that if $A, B \in 2^X$, $A \subset B$ and
$\mu(B) - \mu(A) < \eta(\epsilon)$ then $\rho^1(A, B) < \epsilon$.


2. Segments in $2^X$. Let $A_0, A_1 \in 2^X$. A segment from $A_0$ to $A_1$ is a
continuous mapping $A_t$ of the interval $[0, 1]$ into $2^X$ which satisfies the two
conditions:

2.1. $\mu(A_t) = (1 - t)\mu(A_0) + t\mu(A_1)$.

2.2. If $t' < t''$ then $A_{t'} \subset A_{t''}$.


2.3. LEMMA. Given $A_0, A_1 \in 2^X$, there exists a segment from $A_0$ to $A_1$ if and
only if $A_0 \subset A_1$ and every component of $A_1$ intersects $A_0$.


Proof. First, suppose that $A_t$ is a segment from $A_0$ to $A_1$. If $A_1 = B_0 + B_1$
is a separation of $A_1$ such that $A_0 \subset B_0$, then the subset of $[0, 1]$ consisting
of all $t$ such that $A_t \subset B_0$ and the subset defined by $A_t\cdot B_1 \ne 0$ are closed,
disjoint and they cover $[0, 1]$. Hence $B_1 = 0$.

Second, suppose $A_0, A_1 \in 2^X$, $A_0 \subset A_1$, and every component of $A_1$
intersects $A_0$. Consider the collection of all sets $\mathcal{A} \subset 2^X$ which have the two
properties:

2.4. If $B \in \mathcal{A}$ then $A_0 \subset B \subset A_1$ and every component of $B$ intersects $A_0$.

2.5. If $B_0, B_1 \in \mathcal{A}$ then either $B_0 \subset B_1$ or $B_0 \supset B_1$.
The sum of a monotone family of sets $\mathcal{A}$ of this collection is surely a member
of the collection. Hence there must exist a member $\mathcal{A}_0$ which is saturated
with respect to 2.4 and 2.5. Since the closure of $\mathcal{A}_0$ also satisfies 2.4 and 2.5,
it follows that $\mathcal{A}_0$ is closed.

We now define for $t$, $0 \le t \le 1$, $A_t$ to be that element of $\mathcal{A}_0$, if it exists, such
that $\mu(A_t) = (1 - t)\mu(A_0) + t\mu(A_1)$. By 2.5 we see that $A_t$ is 1-1 and continuity
follows from the continuity of the $\mu$ function. Now the proof will be complete
if we show that $A_t$ is defined for every $t$, $0 \le t \le 1$, or (what is the same) that
for $A_{t'}, A_{t''} \in \mathcal{A}_0$, $0 \le t' < t'' \le 1$, there exists $A \in \mathcal{A}_0$ such that $\mu(A_{t'}) < \mu(A)
< \mu(A_{t''})$. Because of the maximal character of $\mathcal{A}_0$, it is sufficient to show that
there exists some $A \in 2^X$ satisfying $A_{t'} \subset A \subset A_{t''}$, $\mu(A_{t'}) < \mu(A) < \mu(A_{t''})$ with
every component of $A$ intersecting $A_{t'}$. Choose then $\epsilon > 0$ so that $V_\epsilon(A_{t'})$ fails
to contain $A_{t''}$, and let $A$ consist of the components of $A_{t''}\cdot\overline{V_\epsilon(A_{t'})}$ which
intersect $A_{t'}$. Now some component of $A_{t''}$ is not contained in $V_\epsilon(A_{t'})$ and
hence $A_{t'}$ is a proper subset of $A$, while $A$ is surely a proper subset of $A_{t''}$.
The required properties follow.

Since any subarc of a segment is, with proper parametrization, a segment,
we have




2.6. Lemma. If $A \in \mathcal{C}(X)$ then every segment with $A$ as beginning is
contained in $\mathcal{C}(X)$.


The Cantor star is the plane set obtained by joining with a straight line
every point of a discontinuum $D$ which lies on the $x$-axis to the point $(0, 1)$
on the $y$-axis. Each point of the star can be identified by a point $x \in D$ and a
coordinate $y$, $0 \le y \le 1$.


The following theorem has been proved by Mazurkiewicz for $2^X$. (See [4]
and also [1].)


2.7. THEOREM. Each of $2^X$ and $\mathcal{C}(X)$ is the continuous image of the Cantor
star, and hence arcwise connected(3).


Proof. We first show that the set $\Sigma$ of all segments in $2^X$ and the set $\Sigma_1$
of all segments with beginning in $\mathcal{C}(X)$ are compact subsets of $\{2^X\}^E$, where $E$
is the unit interval. Now $\Sigma$ is an equicontinuous collection of mappings, for,
for any segment $A_t$, we have $|\mu(A_{t'}) - \mu(A_{t''})| = |t' - t''|\,(\mu(A_1) - \mu(A_0))
\le |t' - t''|$. Hence by 1.5, if $|t' - t''| < \eta(\epsilon)$ then $\rho^1(A_{t'}, A_{t''}) < \epsilon$. The relations
2.1, 2.2 clearly hold for any limit element and hence $\Sigma$ is compact. That $\Sigma_1$ is a
closed subset of $\Sigma$ follows from the fact that for a convergent sequence of
mappings, the limit of the beginning elements is the beginning element of the
limit.

Let $A_t(x)$, for $x \in D$, be a continuous mapping of the set $D$ onto $\Sigma$ (or $\Sigma_1$).
Now $A_t(x)$ is continuous simultaneously in $x$ and $t$, and since $A_1(x) = X$ for
any $x \in D$, the mapping $f(x, y) = A_y(x)$ is a continuous mapping of the Cantor
star onto $2^X$ (or $\mathcal{C}(X)$).


3. Contractibility(4). We now have the following lemma.


3.1. Lemma. The following properties are equivalent:
(a) $\phi(X)$ is contractible in $2^X$.

(b) $2^X$ is contractible.

(c) $\mathcal{C}(X)$ is contractible (in itself).


Proof. The proof is in three steps. First, (a) implies (b). If $\phi(X)$ is
contractible in $2^X$ there exists a continuous mapping $F(a, t)$ of $X \times E$, where $E$ is
the unit interval, into $2^X$, such that $F(a, 0) = \{a\}$, $F(a, 1) =$ a constant.
Define for $A \in 2^X$, $\mathcal{F}(A, t) = \{F(a, t)\}$ for $a \in A$. Since $F(a, t)$ is a continuous
mapping of $X \times E$ into $2^X$, $\mathcal{F}(A, t)$ maps continuously $2^X \times E$ into $2^{2^X}$. The
deformation $\sigma(\mathcal{F}(A, t))$ is then continuous and contracts $2^X$ in itself.

Second, (b) implies (c). Suppose $2^X$ is contractible. There exists a mapping


(3) Actually, in order that a compact metric space $X$ be the continuous image of the Cantor
star it is necessary and sufficient that there exist an equicontinuous family of mappings of $E$
into $X$ which includes a map of $E$ covering any pair of points. The proof of this proceeds
exactly as that above.


(4) A space $X \subset Y$ is contractible in $Y$ if the identity transformation on $X$ is homotopic to
a constant in $Y$.


$F(A, t)$ of $2^X \times E$ into $2^X$ such that $F(A, 0) = A$ and $F(A, 1) =$ a constant.
Since $2^X$ is arcwise connected we can suppose $F(A, 1) = X$ for all $A \in 2^X$. Let
$\mathcal{F}(A, t) = \{F(A, t')\}$ for $0 \le t' \le t$. Now $\mathcal{F}(A, t)$ is surely a continuous mapping
of $2^X \times E$ into $2^{2^X}$. The deformation $G(A, t) = \sigma(\mathcal{F}(A, t))$ will then be
continuous and will have the properties: $G(A, 0) = A$; $G(A, 1) = X$; if $0 \le t' < t'' \le 1$,
then $G(A, t') \subset G(A, t'')$. Hence for $A$ fixed, $G(A, t')$, $0 \le t' \le t$ defines with
proper parametrization, a segment from $G(A, 0)$ to $G(A, t)$. Hence by 2.6
if $A \in \mathcal{C}(X)$ then $G(A, t) \in \mathcal{C}(X)$ for every $t$, $0 \le t \le 1$. Hence $\mathcal{C}(X)$ is
contractible.

Third, (c) implies (a). This is obvious.

Remark. It follows from the above arguments that if $2^X$ and $\mathcal{C}(X)$ are
contractible then the deformation $G(A, t)$ can be chosen to satisfy


$G(A + B, t) = G(A, t) + G(B, t)$.


If $0 \le t' < t'' \le 1$ then $G(A, t') \subset G(A, t'')$.

We shall consider spaces $X$ having the following property:

3.2. For $\epsilon > 0$ there exists $\delta(\epsilon) > 0$ such that if $a, b \in X$, $\rho(a, b) < \delta(\epsilon)$ and
$a \in A \in \mathcal{C}(X)$, then there exists $B$, $b \in B \in \mathcal{C}(X)$ with $\rho^1(A, B) < \epsilon$.

As a generalization of a theorem of Wojdyslawski (see [9]) we prove


3.3. THEOREM. If $X$ has the property of 3.2 then $2^X$ and $\mathcal{C}(X)$ are
contractible.


Proof. In view of 3.1 it is sufficient to show that $\phi(X)$ is contractible
in $2^X$. We define now a mapping of $X \times E$ into $2^{2^X}$ as follows: $F(a, t) = \{A\}$
where $a \in A \in \mathcal{C}(X)$ and $\mu(A) = t$. Now for $G(a, t) = \sigma(F(a, t))$ we have
$G(a, 0) = \{a\}$ and $G(a, 1) = X$. Hence the proof reduces to showing the
continuity of $F(a, t)$.

First, for $a \in X$ we show uniform continuity in $t$. Suppose $0 \le t' \le t'' \le 1$.
Then from 2.3 we see that for each $A_1 \in F(a, t')$ there exists $A_2 \in F(a, t'')$ such
that $A_2 \supset A_1$, and similarly, given $A_2 \in F(a, t'')$ we can find some $A_1 \in F(a, t')$
with $A_2 \supset A_1$. Hence if $|t'' - t'| < \eta(\epsilon)$ of 1.5 then every element of each of
$F(a, t')$ and $F(a, t'')$ is within $\epsilon$ of some element of the other and $\rho^2(F(a, t'),
F(a, t'')) < \epsilon$.

Finally, if $t$ is fixed $F(a, t)$ is continuous in $a$. If $a$ and $b$ are near and
$A \in F(a, t)$ then by 3.2 we can choose $B$ near $A$, $b \in B \in \mathcal{C}(X)$. Now $\mu(B)$ is
near $\mu(A)$. If $\mu(B) > \mu(A)$ we can choose $B_1$ on a segment from $\{b\}$ to $B$, $B_1$
near $A$ (see 1.5) with $\mu(B_1) = \mu(A)$. If $\mu(B) < \mu(A)$ we can choose $B_1$ on a
segment from $B$ to $X$, with $\mu(B_1) = \mu(A)$. In either case we find $B_1$ near $A$,
$B_1 \in F(b, t)$, and continuity is demonstrated.

Examples. Let $X$ be the curve in the $xy$-plane defined by

\[ y = \sin\frac{1}{x}, \quad \text{for } 0 < x \le 1, \]

\[ -1 \le y \le 1, \quad \text{for } x = 0. \]

It is easy to verify that condition 3.2 is satisfied for $X$ and hence $2^X$ and $\mathcal{C}(X)$
are contractible.
If we add to $X$ the interval

\[ 1 \le y \le \frac{3}{2} \quad \text{for } x = 0, \]

then 3.2 is not satisfied for the curve $X_1$ so obtained. Nevertheless, since $X_1$
can be deformed into $X$, $2^{X_1}$ and $\mathcal{C}(X_1)$ are contractible. This shows that
condition 3.2 is sufficient without being necessary.

If we now add to $X_1$ the points

\[ y = \frac{1}{2} + \sin\frac{1}{x}, \quad \text{for } -1 \le x < 0, \]

we obtain a curve $X_2$ for which $2^{X_2}$ and $\mathcal{C}(X_2)$ fail to be contractible. If $\mathcal{C}(X_2)$
were contractible we could suppose the deformation $F(A, t)$ satisfied the
condition: If $0 \le t' < t'' \le 1$ then $F(A, t') \subset F(A, t'')$. If $a \in X_2$ and $a$ has a
positive $x$-coordinate, there will exist $t_0$ such that $F(a, t_0) \subset X$ and $F(a, t_0)$
contains the interval $-1 \le y \le 1$ for $x = 0$. If $b \in X_2$ has a negative $x$-coordinate,
every continuum containing $b$ is at least one-half unit from $F(a, t_0)$. But $a$
and $b$ can be chosen arbitrarily close, and we have a contradiction.


3.4. THEOREM. The space $2^X$ is acyclic in all dimensions.


Proof. Suppose $Z$ is a $\delta$-cycle in $2^X$, that is, an abstract cycle with vertices
in $2^X$, with the diameter of every simplex less than or equal to $\delta$. For $A \in 2^X$
let $F(A)$ be the set of points in $X$ each of which is at most $\delta$ distance from
some point of $A$. Now if $\rho^1(A, B) \le \delta$, then $\rho^1(F(A), F(B)) \le \delta$, for every point
of $F(A)$ is at most $\delta$ distance from some point of $A$, and this point belongs to
$F(B)$. Hence if we map each vertex $A_i$ of $Z$ into $F(A_i)$ we obtain a $\delta$-cycle $Z_1$
with each vertex at most $\delta$ from the corresponding vertex of $Z$. But from the
definition of $F$ it follows that there is an integer $n$ such that the $n$th iteration
of $F$ carries every $A \in 2^X$ into $X$. Hence $Z$ is $3\delta$ homologous to a cycle on
$X \in 2^X$. The theorem follows.
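The operator $F$ used in this proof (replace $A$ by the set of points within $\delta$ of $A$) has a simple finite analogue. The sketch below counts how many iterations are needed before a subset fills a finite metric space; it assumes, as a stand-in for connectedness of $X$, that the finite space is chained by steps of length at most $\delta$. The names `thicken` and `iterations_to_fill` are illustrative.

```python
def thicken(A, X, delta, d):
    """F(A): all points of X within distance delta of some point of A."""
    return frozenset(x for x in X if any(d(x, a) <= delta for a in A))

def iterations_to_fill(A, X, delta, d):
    """Number of applications of F needed to carry the nonempty subset A onto X."""
    current, whole = frozenset(A), frozenset(X)
    count = 0
    while current != whole:
        larger = thicken(current, whole, delta, d)
        if larger == current:
            # The finite space has a gap wider than delta (hypothetical failure case).
            raise ValueError("X cannot be reached from A in steps of size delta")
        current = larger
        count += 1
    return count

def line_metric(x, y):
    """Distance between points of the real line."""
    return abs(x - y)

segment = range(11)  # the points 0, 1, ..., 10 as a stand-in for an arc
```

For the eleven-point segment with $\delta = 1$, starting from the single point 0 takes ten iterations, while starting from the midpoint takes five.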

Remark. In case $X$ satisfies the condition 3.2 then the preceding theorem
as well as a similar theorem for $\mathcal{C}(X)$ is an obvious consequence of 3.3.

Problem. Is $\mathcal{C}(X)$ always acyclic in all dimensions?

4. Local connectedness and retraction properties. Before proceeding we
note two lemmas:


4.1. Lemma. If $X$ is Peanian then $2^X$ and $\mathcal{C}(X)$ are contractible.
Proof. Any Peano continuum surely has the property of 3.2.


4.2. Lemma. If $\mathcal{A}$ is a Peanian subset of $2^X$ (or $\mathcal{C}(X)$) then $\mathcal{A}$ is contractible
over a subset $\mathcal{B}$ of $2^X$ (or $\mathcal{C}(X)$) such that diameter $\mathcal{A}$ = diameter $\mathcal{B}$.




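Proof. If $\mathcal{F}(A, t)$ is a function contracting $\phi(\mathcal{A})$ in $\mathcal{C}(\mathcal{A})$ ($\mathcal{C}(\mathcal{A})$ is a
subset of $\mathcal{C}(2^X)$), then $\sigma(\mathcal{F}(A, t))$ contracts $\mathcal{A}$ in $\sigma(\mathcal{C}(\mathcal{A}))$. Further, $\mathcal{C}(\mathcal{A})$ has
the same diameter as $\mathcal{A}$, and $\sigma$ is a contraction.

The following local connectivity property implies local $p$-connectedness in
the sense of Lefschetz(5) for $p > 0$.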


4.3. THEOREM. Let $K$ be a finite complex, $K_1$ a subcomplex including all the
1-dimensional simplices of $K$, and $f(K_1) \subset 2^X$ (or $\mathcal{C}(X)$) a continuous mapping
such that the partial image of any simplex of $K$ is of diameter less than $\epsilon$. Then
$f$ may be extended to a mapping of all of $K$ into $2^X$ (or $\mathcal{C}(X)$) so that the diameter
of the image of any simplex is less than $\epsilon$.


Proof. First, let $f(S^n) \subset 2^X$ (or $\mathcal{C}(X)$), $n \ge 1$, be a map of the surface of an
$(n+1)$-cell $E^{n+1}$. Then, by 4.2, $f$ may be extended to a map of all of $E^{n+1}$
into $2^X$ (or $\mathcal{C}(X)$), for the image of $S^n$ is a Peano continuum. Now let $f(K_1)$
be the mapping given in the lemma. Then $\phi f$ is a map of $K_1$ into $\phi(2^X) \subset \mathcal{C}(2^X)$.
We can now extend $\phi f$ to all of each 2-simplex $x^2$ of $K$ so that $x^2$ maps into
$\mathcal{C}(f(x^2\cdot K_1))$. Repeating this process, one dimension at a time, we arrive at a
mapping $\mathcal{F}$ of all of $K$, identical with $\phi f$ on $K_1$, and such that the image of
any simplex $x^n$ is contained in $\mathcal{C}(f(x^n\cdot K_1))$. Hence the diameter of $\mathcal{F}(x^n)$ equals
the diameter of $f(x^n\cdot K_1)$. Since $\sigma$ is a contraction, we see that $\sigma\mathcal{F}$ is the
required extension of $f$.

We now reprove a theorem of Wojdyslawski (see [10]; also [7] and [8]).


4.4. THEOREM (Wojdyslawski). The following statements are equivalent:
(a) $X$ is Peanian.

(b) $2^X$ is Peanian.

(b') $\mathcal{C}(X)$ is Peanian.

(c) $2^X$ is an absolute retract.

(c') $\mathcal{C}(X)$ is an absolute retract(6).


Proof. The proof is contained in the following three assertions:

First, (a) implies (b) and (b'). Suppose that any two points of $X$ less than
$\nu(\epsilon)$ apart can be joined by a continuum of diameter less than $\epsilon$. Then if $A$,
$B \in 2^X$, $\rho^1(A, B) < \nu(\epsilon)$, the set $C$ consisting of all points which can be joined
to $A$ by continua of diameter at most $\epsilon$ has the properties: $\rho^1(A, C) \le \epsilon$,
$\rho^1(B, C) \le 2\epsilon$, $A + B \subset C$ and every component of $C$ intersects both $A$ and $B$. Hence
by 2.3 there exist segments $A_t$ and $B_t$ from $A$ to $C$ and $B$ to $C$, respectively.
The continuum $\mathcal{A} = \{A_t\} + \{B_t\}$, $0 \le t \le 1$, is of diameter less than or equal


(5) See S. Lefschetz, Topology, American Mathematical Society Colloquium Publications,
vol. 12, 1930, p. 91.

(6) A space $X \subset Y$ is a retract of $Y$ if there exists a continuous transformation $f(Y) = X$ where
$f$ is the identity on $X$. The metric separable space $X$ is an absolute retract if it is a retract of every
metric space in which it can be imbedded. See K. Borsuk, Sur les rétractes, Fundamenta
Mathematicae, vol. 17 (1931), pp. 152-170.


to $3\epsilon$, and $\mathcal{A} \subset \mathcal{C}(X)$ if $A$ and $B$ belong to $\mathcal{C}(X)$. Hence $2^X$ and $\mathcal{C}(X)$ are
Peanian.
Second, (a) implies (c) and (c'). Combining the result of the previous
paragraph with that of 4.3 we have: If $K$ is a finite complex, $K_0$ a subcomplex
including all of the vertices of $K$, and if $f(K_0) \subset 2^X$ (or $\mathcal{C}(X)$) is a mapping
such that the partial image under $f$ of any simplex of $K$ is of diameter less
than $\nu(\epsilon/6)$, then $f$ can be extended to a mapping of all of $K$ into $2^X$ (or $\mathcal{C}(X)$)
such that the image of any simplex of $K$ is of diameter at most $\epsilon$. This result,
by a characterization of Lefschetz(7), implies that $2^X$ and $\mathcal{C}(X)$ are absolute
retracts.

Third, either one of (b) or (b') implies (a). If $a, b \in X$, and $\phi(a)$ and $\phi(b)$
can be joined in $2^X$ by a continuum $\mathcal{A}$ of diameter $d$, then by 1.2 $\sigma(\mathcal{A})$ is a
continuum in $X$ about $a + b$ of diameter at most $d$.


4.5. THEOREM. Let $Y$ be a compact, locally connected subset of a metric
space $Z$, and let $f(Y)$ be a continuous mapping of $Y$ into $2^X$ (or $\mathcal{C}(X)$). Then $f$
can be extended to a continuous mapping of all $Z$ into $2^X$ (or $\mathcal{C}(X)$).


Proof. The set $f(Y)$ is locally connected, and since each hyperspace is
arcwise connected, we can find a Peano continuum $\mathcal{A}$, $f(Y) \subset \mathcal{A}$, $\mathcal{A}$ in $2^X$ or
$\mathcal{C}(X)$, respectively. Since $\mathcal{C}(\mathcal{A})$ is an absolute retract we can extend(8) the
transformation $\phi f$ of $Y$ to a mapping $\mathcal{F}$ of $Z$ into $\mathcal{C}(\mathcal{A})$. The mapping $\sigma\mathcal{F}$ is
then the required extension of $f$.

Remark. Consider any closed subset $\mathcal{A}$ of $2^X$ having the property: If
$A \in \mathcal{A}$ and if $B \supset A$ and every component of $B$ intersects $A$ then $B \in \mathcal{A}$. All
the results of §§2, 3, 4 for $2^X$ (except 3.4) can be shown by precisely the same
reasoning to hold for such a set $\mathcal{A}$. In particular the space $\mathcal{C}_n(X)$ consisting
of all closed subsets of $X$ having at most $n$ components, and the space $\mathcal{C}^d(X)$
consisting of all closed sets of diameter greater than or equal to $d$ have these
stated properties of $2^X$.

5. Dimension of hyperspaces. Further topological properties are now
obtained.


5.1. The space $2^X$ always contains the homeomorph of the "fundamental
cube."


Proof. Choose $A_i \in \mathcal{C}(X)$, a sequence of nondegenerate disjoint continua
tending to a point $a \notin A_i$ for any $i$. Now each $2^{A_i}$ contains a nondegenerate
arc $B_i$ and $2^X$ contains topologically the infinite cartesian product
$B_1 \times B_2 \times \cdots$. The theorem follows.

If $X$ is Peanian and $A \in \mathcal{C}(X)$ then the order of $A$ in $X$ is the smallest
integer $n$ such that there exists within any $V_\epsilon(A)$ a neighborhood of $A$ with


(7) Annals of Mathematics, (2), vol. 35 (1934), pp. 118-129.
(8) This is a property of absolute retracts. See Footnote 6.


boundary consisting of at most $n$ points. If no such integer exists then $A$ is
said to be of non-finite order.


5.2. LEMMA. If $X$ is Peanian the order of $A$ is finite for every $A \in \mathcal{C}(X)$ if
and only if $X$ is a graph.

Proof. We need only show that if $X$ is not a graph, $X$ contains a
continuum of non-finite order. If $X$ contains no point constituting a continuum
of non-finite order, $X$ must contain an infinite sequence $a_i$ of ramification
points, and we can suppose $a_i \to a$. If there exists an arc containing infinitely
many of the $a_i$ this arc is of non-finite order. Otherwise, we can choose
infinitely many arcs $a_{i_k}a$, forming a null sequence and with each $a_{i_k}$ contained
in only one arc of the sequence. Then $\sum a_{i_k}a$ is a continuum of non-finite
order.

If $A$ is a closed subset of $X$, $\mathcal{C}(X, A)$ is the subset of $\mathcal{C}(X)$ consisting of
continua which contain $A$. If $A \in \mathcal{C}(X)$ then $A \in \mathcal{C}(X, A)$. Also $\mathcal{C}(X, 0) = \mathcal{C}(X)$.


5.3. Lemma. If $X$ is Peanian, then for every $A \in \mathcal{C}(X)$ we have order $A
\le \dim_A \mathcal{C}(X, A)$.


Proof. If $A \in \mathcal{C}(X)$ is of order $n$, then, using the $n$-Bogensatz(9), we can
choose arcs $B_1, \dots, B_n$, $B_i\cdot A = a_i$ and $(B_i - a_i)$ a collection of disjoint sets.
To each point $(b_1, \dots, b_n)$ of $B_1 \times \cdots \times B_n$ assign the continuum $A + \sum a_i b_i$,
where $a_i b_i$ is the subarc of $B_i$ from $a_i$ to $b_i$.
This correspondence is a homeomorphism and the theorem is proved.

5.4. THEOREM. If $X$ is Peanian then $\dim \mathcal{C}(X) < \infty$ if and only if $X$ is a
linear graph.

Proof. If $\dim \mathcal{C}(X)$ is finite then 5.3 and 5.2 imply that $X$ is a linear graph.
The other half of the theorem is contained in the following sharper statement.

5.5. THEOREM. If $X$ is a connected linear graph then

\[ \dim \mathcal{C}(X) = \max_{A \in \mathcal{C}(X)} (\text{order } A) = 2 + \sum (\text{order } a - 2), \]

the last summation being extended over all points $a \in X$ such that order $a \ge 2$.


Proof. Let $A_1, A_2, \dots, A_m$ be the collection of connected sub-graphs of $X$.
With each $A_i$ there is associated the collection $\mathcal{A}_i$ of continua in $X$ for which
$A_i$ is the maximal sub-graph. Clearly, $\mathcal{C}(X)$ is the sum of the $\mathcal{A}_i$. If the order
of $A_i$ is $n$, then there are, say, $m$ 1-cells containing a single 0-cell of $A_i$ and
$k$ 1-cells containing 2 0-cells of $A_i$, where $m + 2k = n$. By the argument used
in 5.3, we see that $\mathcal{A}_i$ is homeomorphic with the $F_\sigma$-set in $n$-space given by
the inequalities $0 \le x_i < 1$ for $i = 1, \dots, m$, $x_{2j-1} + x_{2j} < 1$ for $j = 1, \dots, k$.
Since $\mathcal{A}_i$ is an $F_\sigma$(10),

(9) See "n-Bogensatz," K. Menger, Kurventheorie, p. 216.

(10) See "Summensatz," K. Menger, Dimensionstheorie, p. 92.


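\[ \dim \mathcal{C}(X) \le \max_i\,(\dim \mathcal{A}_i) = \max_i\,(\text{order } A_i) \le \max\,(\text{order } A). \]

The other necessary inequality is contained in 5.3.

The equality $\max\,(\text{order } A) = 2 + \sum (\text{order } a - 2)$ can be obtained by
a simple induction argument.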
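The right-hand side of the formula in 5.5 is easy to evaluate for a concrete graph. The following sketch assumes the graph is given as a list of edges between labelled vertices, so that the order of a vertex is its degree and interior points of edges (order 2) contribute nothing; the function name `dim_of_subcontinua_space` is illustrative.

```python
from collections import Counter

def dim_of_subcontinua_space(edges):
    """Evaluate 2 + sum(order a - 2) over vertices of order at least 3.

    `edges` is a list of (u, v) pairs; repeated pairs are parallel edges and
    a pair (u, u) is a loop, which adds 2 to the order of u.
    """
    order = Counter()
    for u, v in edges:
        order[u] += 1
        order[v] += 1
    return 2 + sum(k - 2 for k in order.values() if k >= 3)

arc = [(0, 1)]
triod = [(0, 1), (0, 2), (0, 3)]
circle = [(0, 1), (1, 2), (2, 0)]
theta_curve = [(0, 1), (0, 1), (0, 1)]  # two vertices joined by three arcs
```

For the arc and the circle the value is 2, for the triod it is 3, and for the theta curve it is 4.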

Remark. If $X$ is a linear graph $\mathcal{C}(X)$ is actually a polyhedron. We have
also the property: If $X$ is Peanian and $\mathcal{C}(X)$ has finite dimension at every one
of its points then $\mathcal{C}(X)$ must have finite dimension.


6. Local separating points. In this section we prove a theorem of G. T.
Whyburn.


6.1. THEOREM. If $X$ is Peanian, $A$ any closed subset of $X$, $a \in X - A$, then
$a$ is a local separating point of $X$ if and only if $\mathcal{C}(X, A + a)$ contains interior
points relative to $\mathcal{C}(X, A)$.


Proof. First, let $a$ be a nonlocal separating point, $a \in X - A$. For
$B \in \mathcal{C}(X, A + a)$ and $\epsilon > 0$ choose a connected neighborhood $U$ of $a$ of
diameter less than $\epsilon$ so that $\bar{U}\cdot A = 0$. Choose a neighborhood $V$ of $a$, $V \subset U$,
such that $U - V$ is connected. Then $(B + U - V) \in \mathcal{C}(X, A) - \mathcal{C}(X, A + a)$ and
is at most $\epsilon$ distance from $B$. Hence $\mathcal{C}(X, A) - \mathcal{C}(X, A + a)$ is dense in
$\mathcal{C}(X, A)$.

Second, let $a$ be a local separating point of $X$ and $U$ a connected
neighborhood of $a$ such that $U - a = U_1 + U_2$, $\bar{U}_1\cdot\bar{U}_2 = a$. Let $V$ be a connected
neighborhood of $a$ with $\bar{V} \subset U$. Choose a continuum $B \supset A + V$ and intersecting the
boundary of only one of $U_1$ and $U_2$ in points other than $a$. Any continuum
sufficiently near $B$ intersects both $V\cdot U_1$ and $V\cdot U_2$ and fails to intersect the
boundary of one of $U_1$ and $U_2$ in a point different from $a$. Hence $a$ is a point
of this continuum and $B$ is interior to $\mathcal{C}(X, A + a)$ relative to $\mathcal{C}(X, A)$.

Remark. If $X$ is non-Peanian and $a$ is a local separating point then $\mathcal{C}(X, a)$
contains interior points relative to $\mathcal{C}(X)$. The converse is not necessarily true,
however.

If $A$ is the null set we have this corollary.


6.2. COROLLARY. If $X$ is Peanian, $a \in X$ then $a$ is a local separating point
if and only if $\mathcal{C}(X, a)$ contains interior points relative to $\mathcal{C}(X)$.


6.3. THEOREM (G. T. Whyburn(11)). If $X$ is Peanian and $a_i \in X$ is a
sequence of nonlocal separating points, then $X^* = X - \sum a_i$ is connected and locally
connected.

In fact, if $b_1, b_2 \in X^*$, and $b_1$ and $b_2$ can be joined in $X$ by a continuum of
diameter less than $\epsilon$ then the same holds in $X^*$.


Proof. The set $\prod_{i=1}^{\infty} \bigl(\mathcal{C}(X, b_1+b_2) - \mathcal{C}(X, b_1+b_2+a_i)\bigr)$ is, by the theorem of
Baire, dense in $\mathcal{C}(X, b_1+b_2)$, since by 6.1 each set in the product is dense and
open in $\mathcal{C}(X, b_1+b_2)$. Hence any continuum about $b_1+b_2$ is the limit of
continua about $b_1+b_2$ in $X^*$. The theorem follows.


(11) Semi-closed sets and collections, Duke Mathematical Journal, vol. 2 (1936), pp. 684-690.
The above theorem is contained in Theorem 3.2 of the paper cited. I owe this proof to
S. Eilenberg.


7. Continuous transformations. Here we show that for a continuous
transformation $f(X) = Y$ we may under certain conditions find $X_0 \subset X$, with $X_0$
closed and of dimension 0, such that $f(X_0) = Y$.


7.1. LEMMA. If $f(E^2) = E^1$ is a continuous mapping of the unit square onto
the unit interval, then there exist two disjoint arcs $ab$ and $cd$ in $E^2$, each containing
at most one boundary point of $E^2$, such that $f(ab + cd) = E^1$.


Proof. The interior of $E^2$ maps into a connected set which is dense in $E^1$.
Choose $a \in f^{-1}(0)$, $b \in f^{-1}(2/3)$, $c \in f^{-1}(1/3)$, $d \in f^{-1}(1)$ so that $b$ and $c$ do not
belong to the boundary of $E^2$. Choose $ab$ and $cd$ disjoint arcs in $E^2$ having at
most $a$ and $d$ in common with the boundary of $E^2$. Then $f(ab) \supset (0, 2/3)$ and
$f(cd) \supset (1/3, 1)$.


7.2. THEOREM(12). If $f(E^2) = E^1$ is a continuous mapping of the unit square
onto the unit interval then there exists a closed totally disconnected subset $Z$ of $E^2$
such that $f(Z) = E^1$.


Proof. Let $\mathcal{A}$ be the subset of $2^{E^2}$ consisting of all subsets of $E^2$ which
map onto $E^1$ under $f$. Let $\mathcal{A}_\epsilon$ be the subset of $\mathcal{A}$ consisting of sets having
only components of diameter less than $\epsilon$. Clearly $\mathcal{A}_\epsilon$ is open in $\mathcal{A}$, and we
shall show $\mathcal{A}_\epsilon$ is dense in $\mathcal{A}$. Since a residual set in a complete space is
nonvacuous, it will be true that $\prod \mathcal{A}_{1/n} \ne 0$, and any $A \in \prod \mathcal{A}_{1/n}$ will be a totally
disconnected closed set mapping on $E^1$.

Suppose $A \in \mathcal{A}$ and $\epsilon > 0$ are given. We shall find $B \in \mathcal{A}_\epsilon$, $\rho^1(A, B) < \epsilon$.
Choose a subdivision of $E^2$ into closed squares $S_1, S_2, \ldots, S_n$, each of
diameter less than $\epsilon/4$. For each $S_r$ which intersects $A$ choose arcs $a_r b_r$ and
$c_r d_r$ by 7.1, each mapping onto $f(S_r)$, and let $B$ be the sum of the arcs so
chosen. Since dia $S_r < \epsilon/4$, $B$ has only components of diameter less than $\epsilon$.
Since $B$ intersects those and only those squares $S_r$ which are cut by $A$,
$\rho^1(A, B) < \epsilon$ and $f(B) \supset E^1$. Hence $B \in \mathcal{A}_\epsilon$ and the proof is complete.
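
The distance estimate in this proof can be illustrated numerically. The following Python sketch is not part of the original paper: it assumes that $\rho^1$ denotes the Hausdorff metric, replaces the sets $A$ and $B$ by small finite point sets, and uses a hypothetical grid of squares of side `h`. It checks that two sets meeting exactly the same squares lie within one square diameter of each other.

```python
import math

def hausdorff(A, B):
    """Hausdorff distance between two finite, non-empty sets of plane points."""
    def one_sided(S, T):
        # Largest distance from a point of S to its nearest point of T.
        return max(min(math.dist(p, q) for q in T) for p in S)
    return max(one_sided(A, B), one_sided(B, A))

def cells(points, h):
    """Grid squares of side h (indexed by integer pairs) met by the points."""
    return {(math.floor(x / h), math.floor(y / h)) for x, y in points}

h = 0.1                                  # side of each square (assumed)
A = [(0.05, 0.05), (0.31, 0.72)]         # stand-in for the set A
B = [(0.09, 0.01), (0.39, 0.79)]         # stand-in for the set B

same_cells = cells(A, h) == cells(B, h)  # B cuts exactly the squares cut by A
distance = hausdorff(A, B)
bound = h * math.sqrt(2)                 # diameter of one square
```

Because every point of either set shares a square with some point of the other, `distance` cannot exceed `bound`; this is the finite analogue of the step from dia $S_r < \epsilon/4$ to $\rho^1(A, B) < \epsilon$.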

We now obtain a similar theorem with more general space and more
special type of transformation. First, consider a transformation $f(X) = Y$ where

7.3. (a) $X$ is compact and metric and dim $Y < \infty$.

(b) $f$ is monotone and interior(13).

(c) dia $f^{-1}(y) > 0$ for all $y \in Y$.


(12) I owe this theorem to S. Eilenberg and L. Zippin.

(13) A transformation is monotone if the inverse of every point in the image space is
connected. See R. L. Moore, Foundations of Point Set Theory, American Mathematical Society
Colloquium Publications, vol. 13, 1932, chap. 5. The term "monotone" is due to C. B. Morrey,
American Journal of Mathematics, vol. 57 (1935), pp. 17-50. A transformation is interior if
open sets map into open sets. For references see G. T. Whyburn, Duke Mathematical Journal,
vol. 3 (1937), pp. 370-381.




7.4. LEMMA. Under the hypothesis of 7.3 for any $A \in 2^X$ where $f(A) = Y$ and
for any $\epsilon > 0$ there exists $B \in 2^X$ such that

(a) $\rho^1(A, B) < \epsilon$.

(b) $f(B) = Y$.

(c) Every component of $B$ is of diameter less than $\epsilon$.


Proof. It is sufficient to find $B \subset V_\epsilon(A)$ and satisfying (b) and (c) since
by adding a finite number of points to such a $B$ we may obtain a set within
$\epsilon$ of $A$. Let $V_0 = V_{\epsilon/2}(A)$. We shall need these three lemmas:


7.5. LEMMA. There exists $r(\epsilon) > 0$ such that $f(V_\epsilon(x)) \supset V_{r(\epsilon)}(f(x))$ for every
$x \in X$.


7.6. LEMMA. There exists $d > 0$ such that for any $y \in Y$ there is a component
$A_y$ of $V_0 \cdot f^{-1}(y)$ such that dia $A_y \ge d$.


7.7. LEMMA. There exists an integer $N$ such that $Y$ allows an arbitrarily fine
covering by open sets $W_1, \ldots, W_m$ such that at most $N$ of the sets $W_i$ intersect
any given $W_r$.


The first of these is a simple consequence of interiority, the second follows
since dia $f^{-1}(y) > 0$ for all $y \in Y$, and the third is true since $Y$ can be imbedded
in a finite-dimensional euclidean space.


Let $s = \min[\epsilon/3, d/8N]$ and construct a covering of $Y$ of the type 7.7
with dia $W_r < r(s)$ for $r = 1, \ldots, m$. Let $U_r = f^{-1}(W_r)$. Choose $a_1 \in U_1 \cdot V_0$ and
let $A_1 = U_1 \cdot V_s(a_1)$. Choose successively then $a_r \in U_r \cdot V_0$ and $A_r = U_r \cdot V_s(a_r)$ so
that $A_r \cdot A_i = 0$ for $i < r$. That this is always possible is shown as follows:
Choose $y \in W_r$ and $A_y$ of 7.6. At most $N$ of the sets $A_1, \ldots, A_{r-1}$ intersect $U_r$,
and each $A_i$ is of dia less than $2s$. If $\sum_{i=1}^{r-1} V_s(A_i \cdot U_r)$ intersected $V_s(a)$ for
every $a \in A_y$ then $\sum_{i=1}^{r-1} V_{2s}(V_s(A_i \cdot U_r)) \supset A_y$ and dia $A_y \le N \cdot 8s < d$ which is
impossible. Hence it is possible to choose $A_1, \ldots, A_m$ as prescribed. Finally,
f(A) DW, and Let and the result follows. 


7.8. THEOREM. Let f(X) = Y be a monotone interior transformation of a com- 
pact metric space X into a set Y of finite dimension. Then there exists a closed 
totally disconnected subset Xo of X mapping onto Y if and only if the set of 
points on which f is 1-1 is a totally disconnected subset of Y. 


Proof. First, suppose $f^{-1}(y)$ contains more than a single point for every
$y \in Y$. If $\mathcal{A} \subset 2^X$ is the set of all sets mapping onto $Y$ under $f$, then by 7.4
the subset $\mathcal{A}_{1/n}$ of sets with components of diameter less than $1/n$ is dense
in $\mathcal{A}$. Any set belonging to the residual set $\prod \mathcal{A}_{1/n}$ then satisfies the theorem.

Second, suppose $f^{-1}(y)$ consists of a single point for all $y \in B$, $B$ a totally
disconnected set. Since $f$ is interior, $B$ is closed. Let $V_n = V_{1/n}(B)$ and using
the result of the previous paragraph choose $A_n$, closed, totally disconnected
and mapping on $\bar V_n - V_{n+1}$. Then $X_0 = \sum A_n + f^{-1}(B)$ is easily seen to be totally
disconnected and maps onto $Y$.


Finally, if $f$ is 1-1 on a continuum, it is clearly impossible to find $X_0$
satisfying the theorem.

8. Knaster continua. A compact metric continuum is indecomposable if it 
cannot be written as the sum of two proper subcontinua. 


8.1. LEMMA. If $X$ is indecomposable and $\mathcal{A}_{AB}$ is an arc in $\mathcal{C}(X)$ with
$\sigma(\mathcal{A}_{AB}) = X$ then $X \in \mathcal{A}_{AB}$.


Proof. Let $C$ be the first element in order from $A$ to $B$ such that
$\sigma(\mathcal{A}_{AC}) = X$. For each $C_1$ preceding $C$ the continuum $\sigma(\mathcal{A}_{AC_1})$ is contained in
the composant about $A$ of $X$, and hence $\sigma(\mathcal{A}_{C_1C})$ contains points both in
this composant and in its complement. Thus $\sigma(\mathcal{A}_{C_1C}) = X$ for all $C_1$
preceding $C$, and therefore $C = X$.


8.2. THEOREM. In order that $X$ be indecomposable it is necessary and
sufficient that $\mathcal{C}(X) - X$ fail to be arcwise connected.


Proof. If $X$ is indecomposable then for any arc $\mathcal{A}_{AB}$ where $A$ and $B$ lie
in different composants of $X$ we have $\sigma(\mathcal{A}_{AB}) = X$ and hence $X \in \mathcal{A}_{AB}$. Thus
$\mathcal{C}(X) - X$ is not arcwise connected.

If $X$ is not indecomposable write $X = A_1 + A_2$, $A_i \in \mathcal{C}(X)$, $A_i \ne X$, for
$i = 1, 2$. If $B \in \mathcal{C}(X)$, $B \ne X$, and $a \in B \cdot A_1 \cdot A_2$ then there exists a segment
joining $\{a\}$ to $B$, and also segments joining $\{a\}$ to both $A_1$ and $A_2$. If $B \subset A_1$
there is a segment from $B$ to $A_1$. In any event $B$ can be joined by an arc to
both of $A_1$ and $A_2$ in $\mathcal{C}(X) - X$ and the theorem is proved.

A compact metric continuum is a Knaster continuum(14) if every
subcontinuum is indecomposable. If $X$ is a Knaster continuum and if $A, B \in \mathcal{C}(X)$,
then either $AB = 0$, $A \supset B$ or $B \supset A$. Hence:


8.3. LEMMA. If $X$ is a Knaster continuum, $A, B \in \mathcal{C}(X)$, $AB \ne 0$ and
$\mu(A) = \mu(B)$, then $A = B$.


8.4. THEOREM. The continuum $X$ is a Knaster continuum if and only if
$\mathcal{C}(X)$ contains a unique arc between every pair of its elements.


Proof. If $\mathcal{C}(X)$ contains a unique arc between every pair of elements then
for any $A \in \mathcal{C}(X)$, $\mathcal{C}(A) - A$ must fail to be arcwise connected and hence by
8.2 $A$ is indecomposable. Therefore $X$ is a Knaster continuum.

Suppose $X$ is a Knaster continuum and $\mathcal{A}_{AB}$ an arc in $\mathcal{C}(X)$. Since $\sigma(\mathcal{A}_{AB})$
is indecomposable, by 8.1 we have $\sigma(\mathcal{A}_{AB}) \in \mathcal{A}_{AB}$. Hence the function $\mu$
assumes a unique maximum on any simple arc, and if $C = \sigma(\mathcal{A}_{AB})$ then $\mu$ must
be strictly monotone on each of $\mathcal{A}_{AC}$ and $\mathcal{A}_{CB}$. For $C_1 \in \mathcal{A}_{AC}$ we then have
$C_1 = \sigma(\mathcal{A}_{AC_1})$. It follows that $\mathcal{A}_{AC}$ and $\mathcal{A}_{CB}$ are, with proper parametrization,
segments. From 8.3 we see that there exists a unique continuum containing $A$
at which $\mu$ assumes any specified value. Hence the arc $\mathcal{A}_{AB}$ is unique.


(14) The only known example of a continuum of this type was given by B. Knaster in his
dissertation, Un continu dont tout sous-continu est indécomposable, Fundamenta Mathematicae,
vol. 3 (1922), pp. 247-286.


8.5. THEOREM. If $X$ is a Knaster continuum, for every $\epsilon > 0$ there exists a
monotone interior transformation $f(X) = Y$ such that $0 <$ dia $f^{-1}(y) < \epsilon$ for all
$y \in Y$.


Proof. Choose $d > 0$ such that if $\mu(A) = d$ then dia $A < \epsilon$. For each $a \in X$
there is, by 8.4, a unique $A(a) \in \mathcal{C}(X)$ such that $a \in A(a)$ and $\mu(A(a)) = d$. If
$A(a) \cdot A(b) \ne 0$ then by 8.3 $A(a) = A(b)$. If $\lim a_i = a$, then since $\mu$ is
continuous and $A(a)$ single-valued, $\lim A(a_i) = A(a)$. The map $A(a)$ is then a
continuous monotone interior transformation of $X$ into $\mathcal{C}(X)$ and satisfies the
conditions of the theorem.


8.6. THEOREM. If $X$ is a Knaster continuum and if there exists, for every
$\epsilon > 0$, a monotone interior transformation $f(X) = Y$ such that:

(a) $0 <$ dia $f^{-1}(y) < \epsilon$ for all $y \in Y$;

(b) dim $Y < \infty$,
then dim $X = 1$.


Proof. Under the hypotheses of the theorem we shall exhibit an $\epsilon$-covering
of order 2 of $X$ by closed sets. Choose $X_0 \subset X$, by 7.8, closed, totally
disconnected, with $f(X_0) = Y$. Let $U$ be an open set about $X_0$ so that the diameter
of any component of $\bar U$ is less than $\epsilon$. Every component of $X - U$ is of
diameter less than $\epsilon$, for if $A \subset X - U$, dia $A \ge \epsilon$, then for $a \in A$, $f^{-1}(f(a)) \subset A$. But
this contradicts the fact that $f(X_0) = Y$. Write each of $\bar U$ and $X - U$ as the
sum of a finite number of closed disjoint sets of diameter less than $\epsilon$. The
resulting covering of $X$ is surely of order 2.

From 8.4 and 8.6 and the fact that the monotone image of a Knaster con- 
tinuum is also a Knaster continuum we have: 


8.7. THEOREM. If $X$ is a Knaster continuum of dimension greater than 1
then:

(a) for every $\epsilon > 0$ there is a monotone interior transformation $f(X) = Y$,
$0 <$ dia $f^{-1}(y) < \epsilon$ for all $y \in Y$, with dim $Y = \infty$;

(b) there exists an $\epsilon > 0$ such that for any monotone interior $f(X) = Y$, with
$0 <$ dia $f^{-1}(y) < \epsilon$ for all $y \in Y$, it is true that dim $Y = \infty$;

(c) there exist Knaster continua of infinite dimension.


Remark. Theorem 8.6 could be demonstrated without the restriction (b)
on dimension if instead of 7.8 we had at our disposal the theorem: If
$f(X) = Y$ is monotone interior then there exists $X_0$, closed in $X$, with $f(X_0) = Y$,
such that $X_0 \cdot f^{-1}(y)$ is totally disconnected for all $y \in Y$; that is, such that $f$ is
light on $X_0$. This statement is much weaker, except for restriction on dim $Y$,
than 7.8, and its truth would imply that every Knaster continuum is of
dimension one.


J. L. KELLEY 


BIBLIOGRAPHY 


1. K. Borsuk and S. Mazurkiewicz, Sur l'hyperespace d'un continu, Comptes Rendus des
Séances de la Société des Sciences et des Lettres de Varsovie, vol. 24 (1931), pp. 149-152.

2. K. Kuratowski, Topologie, p. 92.

3. S. Mazurkiewicz, Sur les continus absolument indécomposables, Fundamenta
Mathematicae, vol. 16 (1930), pp. 151-159.

4. S. Mazurkiewicz, Sur l'hyperespace d'un continu, Fundamenta Mathematicae, vol. 18 (1932), pp.
171-177.

5. S. Mazurkiewicz, Sur le type C de l'hyperespace d'un continu, Fundamenta Mathematicae, vol. 20
(1933), pp. 52-53.

6. S. Mazurkiewicz, Ein Zerlegungssatz, Fundamenta Mathematicae, vol. 23 (1934), pp. 11-14.

7. L. Vietoris, Kontinua zweiter Ordnung, Monatshefte für Mathematik und Physik, vol. 33
(1923), pp. 49-62.

8. T. Wazewski, Sur un continu singulier, Fundamenta Mathematicae, vol. 4 (1923), pp.
214-235.

9. M. Wojdyslawski, Sur la contractilité des hyperespaces de continus localement connexes,
Fundamenta Mathematicae, vol. 30 (1938), pp. 247-252.

10. M. Wojdyslawski, Rétractes absolus et hyperespaces des continus, Fundamenta Mathematicae, vol.
32 (1939), pp. 184-192.


UNIVERSITY OF VIRGINIA, 
CHARLOTTESVILLE, VA. 

UNIVERSITY OF NOTRE DAME,
NOTRE DAME, IND.




TOPICS IN THE THEORY OF MARKOFF CHAINS 


BY 
J. L. DOOB 


Let $P(t): (p_{ij}(t))$ be a matrix (finite- or infinite-dimensional), depending
on $t > 0$, whose elements satisfy the following conditions

(0.1) $p_{ij}(t) \ge 0, \quad \sum_j p_{ij}(t) = 1, \quad P(s)P(t) = P(t)P(s) = P(s+t).$

Then $p_{ij}(t)$ can be considered a transition probability of a Markoff chain: A
system is supposed which can assume various numbered states, and $p_{ij}(t)$ is
the probability that the system is in the $j$th state at the end of a time interval
of length $t$, if it was in the $i$th state at the beginning of the interval. The
present paper will be divided into two parts. In the first, the regularity properties
of $P(t)$, and its asymptotic properties as $t \to 0$, $t \to \infty$ are studied. These
problems have been solved in the finite-dimensional case by Doeblin(1). In the
infinite-dimensional case new situations can arise, and the results are
somewhat different. The method of approach is new, depending on two theorems
(Theorems 2 and 3) concerning matrices whose elements are non-negative,
and which have row sums less than or equal to 1. The method of approach
can also be applied to the study of the asymptotic properties of the powers
of a matrix of non-negative elements, with row sums 1. In the second part
of the paper, the actual transitions connected with Markoff chains are
investigated: That is, the properties of the function $\xi(t)$, the number of the state
which the given system assumes at time $t$, are investigated. The continuity
properties of $\xi(t)$ are analyzed, and related to the regularity properties of
the $p_{ij}(t)$.
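
As a concrete finite-dimensional illustration of (0.1), the following Python sketch (not part of the original paper) checks the three conditions numerically for a two-state chain. The closed form used for $P(t)$ and the jump rates `a` and `b` are assumptions chosen only for this example.

```python
import math

def two_state_P(t, a=1.0, b=2.0):
    """P(t) for a two-state chain with assumed jump rates a (0 -> 1), b (1 -> 0)."""
    e = math.exp(-(a + b) * t)
    s = a + b
    return [[(b + a * e) / s, (a - a * e) / s],
            [(b - b * e) / s, (a + b * e) / s]]

def matmul(A, B):
    """Product of two small matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def close(A, B, tol=1e-12):
    """Entrywise comparison up to a floating-point tolerance."""
    return all(abs(x - y) <= tol for ra, rb in zip(A, B) for x, y in zip(ra, rb))

s, t = 0.3, 1.1
Ps, Pt, Pst = two_state_P(s), two_state_P(t), two_state_P(s + t)

nonneg = all(x >= 0 for row in Pt for x in row)              # p_ij(t) >= 0
rows_one = all(abs(sum(row) - 1) < 1e-12 for row in Pt)      # row sums equal 1
semigroup = close(matmul(Ps, Pt), Pst) and close(matmul(Pt, Ps), Pst)
```

All three flags come out true for this family, which is the sense in which such a $P(t)$ is a solution of (0.1).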


LEMMA 1. Suppose that the function $f(t)$, defined for all $t > 0$, satisfies the
functional equation

(1.1) $f(s+t) = \sum_n g_n(s) h_n(t) \quad (s, t > 0),$

where $g_n(s)$, $h_n(s)$ are defined for $s > 0$, where $h_n(s)$ is measurable, and where for
each fixed $s$, if $0 < a < b$, the series converges uniformly for $a \le t \le b$. Then $f(t)$ is
continuous, for all $t > 0$.


Presented to the Society, December 31, 1941; received by the editors May 24, 1941.

(1) Bulletin des Sciences Mathématiques, (2), vol. 62 (1938), pp. 21-32, and vol. 63 (1939),
pp. 35-37. In the following, these papers will be referred to as Doeblin (I). Fréchet has discussed
the solutions of (0.1) in great detail, in the finite-dimensional case, with full references to earlier
authors in his book Traité du Calcul des Probabilités et de ses Applications, vol. 1, Part 3, Book 2,
Méthode des fonctions arbitraires ..., Paris, 1938.


It will be sufficient to prove that if a value $t_0$ of $t$ is given, and if $\{\delta_j\}$ is
a sequence of numbers approaching 0, then $f(t_0 + \delta_{a_j}) \to f(t_0)$, for some
subsequence $\{\delta_{a_j}\}$ of $\{\delta_j\}$. By a theorem of Auerbach(2), there is, corresponding
to each $h_n(t)$, a subsequence $\{\delta_{a_j}\}$ of $\{\delta_j\}$, such that

(1.2) $\lim_{j \to \infty} h_n(t + \delta_{a_j}) = h_n(t)$

for almost all $t$ in the interval $0 < t < t_0$. There is then, using the diagonal
process, a subsequence $\{\delta_{a_j}\}$ of $\{\delta_j\}$ such that (1.2) is true for all $n$, $0 < t < t_0$,
except possibly on a $t$-set of measure 0. If $0 < t < t_0$, and if $j$ is large,

(1.3) $f(t_0 + \delta_{a_j}) = \sum_n g_n(t_0 - t) h_n(t + \delta_{a_j}),$

and if $t$ is not in the exceptional set, (1.3) implies, when $j \to \infty$,

(1.4) $f(t_0 + \delta_{a_j}) \to \sum_n g_n(t_0 - t) h_n(t) = f(t_0)$

(because of the uniform convergence of the series in (1.3) with respect to $j$),
as was to be proved.


THEOREM 1. If the matrix function $P(t)$ satisfies (0.1), the measurability of
the $p_{ij}(t)$ implies their continuity.


This follows at once from Lemma 1. It has been shown by Doeblin (I) and
it will be a corollary of results to be proved below, that the $p_{ij}(t)$ satisfying
(0.1) are always continuous if the matrix $P(t)$ is finite-dimensional, even if
measurability is not assumed. The following example shows that there are
non-measurable solutions of (0.1).

In this example, the $p_{ij}(t)$ take on only the values 0, 1, and $P(t)$ is a
permutation matrix. Hamel has shown that there is a function $f(x)$, defined for
all real $x$, taking on only rational values, and satisfying the functional
equation(3) $f(x+y) = f(x) + f(y)$. Let $\{r_j\}$ be an enumeration of all the rational
numbers, and let $T_s$ be the transformation of these numbers taking $r_j$ into
$r_j + f(s)$. The transformation can be represented by a matrix $P(s): (p_{ij}(s))$,
where $p_{ij}(s) = 1$ if $T_s r_i = r_j$, and $p_{ij}(s) = 0$ otherwise. Then evidently $P(s+t)
= P(s)P(t)$, and (0.1) is satisfied. The functions $p_{ij}(t)$ are not measurable,
since by Theorem 1 measurability would imply continuity, and they obviously
are not continuous.

The following theorem describes completely the solutions of (0.1) which
are independent of $t$. It will be useful to weaken (0.1) slightly. The theorem
is essentially known, at least in an indirect form(4).


(2) Fundamenta Mathematicae, vol. 11 (1928), pp. 196-197.

(3) Mathematische Annalen, vol. 60 (1905), pp. 459-462. To ensure that Hamel's $f(x)$ take
on only rational values, we can set, using his notation, $f(a) = 1$, $f(b) = \cdots = 0$.

(4) Cf., for example, K. Yosida and S. Kakutani, Japanese Journal of Mathematics, vol. 16
(1939), pp. 47-55.




1942 THEORY OF MARKOFF CHAINS 39 
THEOREM 2. Let $U: (u_{ij})$ be a matrix of elements satisfying the following
conditions:

(2.1) $u_{ij} \ge 0, \quad \sum_j u_{ij} \le 1, \quad U^2 = U.$

Then the subscripts can be divided into mutually exclusive classes(5) $F, G_1, G_2, \ldots$
such that

(a) $u_{ij} = 0$, if $j \in F$;
(b) there are positive numbers $u_j$, $j \notin F$, such that(6)

$u_{ij} = \delta_{IJ} u_j \quad (\text{if } i \in I, j \in J), \qquad \sum_{j \in J} u_j = 1;$

(c) there are non-negative numbers $\{\rho_{iJ}\}$ such that

$u_{ij} = \rho_{iJ} u_j \quad (i \in F, j \in J).$

Conversely, if the $u_{ij}$ satisfy (a), (b), (c), then (2.1) is true.


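Suppose (2.1) is true. Define $F$ as the set of integers $j$ with $u_{ij} = 0$ for all $i$.
Then (a) is true. Unless $U$ is the null matrix, there will be subscripts not in $F$.
The extreme members of the inequality

(2.2) $\sum_k u_{ik} = \sum_k \sum_j u_{ij} u_{jk} = \sum_j u_{ij} \sum_k u_{jk} \le \sum_j u_{ij}$

are equal; so if $\sum_k u_{jk} < 1$, $u_{ij} = 0$: $j \in F$. Let $\xi_1, \xi_2, \ldots$ be any numbers
satisfying the conditions:

(2.3) $\sum_i |\xi_i| < \infty, \quad \sum_i \xi_i u_{ij} = \xi_j \quad (\text{all } j).$

Then $\xi_j = 0$ if $j \in F$. If $G$ is the set of integers $j$ for which $\xi_j > 0$,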


and there is an impossible inequality unless $u_{ij} = 0$ whenever $\xi_i < 0$, $\xi_j > 0$.
Let $i, j$ be any two distinct integers not in $F$. Unless the $i$th and $j$th columns
of $U$ are proportional (neglecting elements in the columns whose first
subscripts are in $F$), there are integers $r, s$ ($\notin F$), such that


(5) The class $F$ may be absent, or the $G_a$ may be absent. The latter case will arise when and
only when $U$ is the null matrix.

(6) In the following, capital letters $I, J, K$ will be used to denote the $G_a$, and a subscript $i$
will always belong to the class $I$, and so on, unless the contrary is explicitly stated. The notation
$\delta_{IJ}$ is the usual Kronecker $\delta$.




(2.5) $\begin{vmatrix} u_{ri} & u_{rj} \\ u_{si} & u_{sj} \end{vmatrix} \ne 0.$


Then $\lambda, \mu$ can be chosen so that $\lambda u_{ri} + \mu u_{si} < 0$, $\lambda u_{rj} + \mu u_{sj} > 0$. Since
$\xi_k = \lambda u_{rk} + \mu u_{sk}$ provides a solution of (2.3) with $\xi_i < 0$, $\xi_j > 0$, it follows
that $u_{ij} = 0$ if (2.5) is true. The subscripts not in $F$ fall into classes, $G_1, G_2, \ldots$,
putting subscripts in the same class if the corresponding partial columns are
proportional; if $i, j$ ($\notin F$) are not in the same class, $u_{ij} = 0$. The elements $u_{ij}$
with $i, j$ in a class $G_a$ determine a matrix of rank 1. The rows of this matrix
are therefore proportional, in fact identical, since the row sums are 1. We can
thus write $u_{ij} = \delta_{IJ} u_j$ ($i \in I$, $j \in J$), and

(2.6) $\sum_{j \in J} u_j = \sum_j u_{j_0 j} = 1 \quad (j_0 \in J).$

If $i \in F$, $j \in J$,

(2.7) $u_{ij} = \sum_k u_{ik} u_{kj} = \Bigl(\sum_{k \in J} u_{ik}\Bigr) u_j = \rho_{iJ} u_j,$

where $\rho_{iJ}$ is defined by the sum in the parentheses. If $j \notin F$, $u_j$ cannot vanish,
since the elements of the $j$th column ($j \notin F$) cannot all vanish. We have now
shown that (2.1) implies (a), (b), (c). Conversely, if (a), (b), (c) are true,
(2.1) can be checked at once.
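
The converse direction can be checked mechanically on a small case. The Python sketch below is an illustration only: the five subscripts, the classes $F = \{0\}$, $G_1 = \{1, 2\}$, $G_2 = \{3, 4\}$, the weights `u` and the numbers `rho` are all hypothetical choices, and the helper names are not from the paper. It builds $U$ from (a), (b), (c) and tests the conditions (2.1), taking the $\rho_{iJ}$ with sum at most 1 so that the row of a subscript in $F$ has sum at most 1.

```python
def build_U(n, classes, u, rho):
    """Assemble U from the data of (a), (b), (c).

    classes: list of index lists (the classes G_a)
    u:       dict mapping each subscript outside F to its positive weight u_j
    rho:     dict mapping each subscript in F to its list of rho_iJ, one per class
    """
    U = [[0.0] * n for _ in range(n)]
    for J in classes:                      # (b): u_ij = u_j inside a class
        for i in J:
            for j in J:
                U[i][j] = u[j]
    for i, weights in rho.items():         # (c): u_ij = rho_iJ * u_j for i in F
        for J, r in zip(classes, weights):
            for j in J:
                U[i][j] = r * u[j]
    return U                               # (a): columns in F stay zero

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

classes = [[1, 2], [3, 4]]
u = {1: 0.25, 2: 0.75, 3: 0.5, 4: 0.5}     # weights sum to 1 on each class
rho = {0: [0.5, 0.25]}                     # subscript 0 forms the class F

U = build_U(5, classes, u, rho)
UU = matmul(U, U)
idempotent = all(abs(UU[i][j] - U[i][j]) < 1e-12
                 for i in range(5) for j in range(5))
row_sums = [sum(row) for row in U]
```

Here the row belonging to $F$ has sum 0.75 while the other rows have sum 1, so the example also shows why (2.1) is stated with row sums that need not equal 1.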


THEOREM 3. Let $\mathfrak{M}$ be a set of matrices (finite- or infinite-dimensional) with
non-negative elements, and row sums less than or equal to 1. Suppose that the
matrices in $\mathfrak{M}$ form a group $\mathfrak{M}'$. There is then a $U \in \mathfrak{M}$ (the identity in $\mathfrak{M}'$) with
$U^2 = U$. If $U$ is the null matrix, $U$ is the only matrix in $\mathfrak{M}$, and $\mathfrak{M}'$ consists only
of the identity. If $U$ is not the null matrix, we shall use the notation of Theorem 2
to describe its elements. The group $\mathfrak{M}'$ is always isomorphic to a permutation
group acting on the $G_a$. If $(p_{ij})$ is a matrix of $\mathfrak{M}$, and if the corresponding
permutation takes $I_1$ into $I_2$(7), then


= GEhjEJ), 
(3.1) Pri = Prt Mi 
pis = O Gj € F). 


Evidently if $U$ is the null matrix, it is the only matrix in $\mathfrak{M}$, and $\mathfrak{M}'$
consists only of the identity. We shall assume from now on that $U$ is not the null
matrix, and use the notation of Theorem 2. Suppose that $P:(p_{ij}) \in \mathfrak{M}$. Then
since $P = PU = UP$,

(3.2) $p_{ik} = \sum_j p_{ij} u_{jk} = \sum_j u_{ij} p_{jk}.$

If $k \in F$, (3.2) shows that $p_{ik} = 0$, the last equation of (3.1). If $i \in I$, $k \in K$,
(3.2) becomes

(3.2') $p_{ik} = \Bigl(\sum_{j \in K} p_{ij}\Bigr) u_k,$

(3.2'') $p_{ik} = \sum_{j \in I} u_j p_{jk}.$

According to (3.2'), $p_{ik}/u_k$ depends only on $i$, $K$, and according to (3.2''),
$p_{ik}$ depends only on $I$, $k$. Then $p_{ik}/u_k$ depends only on $I$, $K$:

(3.3) $p_{ik} = \sigma_{IK} u_k \quad (i \in I, k \in K).$


(7) As usual, letters $I$, $J$ refer to the $G_a$.


There is a $P': (p'_{ij})$ in $\mathfrak{M}$ which is the inverse of $P$ in $\mathfrak{M}'$. If we write
$p'_{ij} = \sigma'_{IJ} u_j$, for $i, j \notin F$, the equation $U = P'P$ implies

(3.4) $\sum_J \sigma'_{IJ} \sigma_{JK} = \delta_{IK}.$

The $\sigma_{IJ}$ are non-negative and

(3.5) $\sum_K \sigma_{IK} = \sum_k p_{ik} \le 1.$

If $I = K$ in (3.4), we obtain

(3.6) $1 = \sum_J \sigma'_{IJ} \sigma_{JI} \le \sum_J \sigma'_{IJ} \le 1.$

There must be equality throughout in (3.6); therefore if $\sigma_{JI} < 1$, it follows that
$\sigma'_{IJ} = 0$. The matrices $(\sigma_{IJ})$, $(\sigma'_{IJ})$ play symmetric roles; so if $\sigma'_{JI} < 1$, it follows
that $\sigma_{IJ} = 0$. Then if $\sigma_{IJ} < 1$, $\sigma'_{JI} = 0 < 1$; so $\sigma_{IJ} = 0$. Each element in the matrix
$(\sigma_{IJ})$ is either 1 or 0, and by (3.6) there is a 1 in each row of $(\sigma'_{IJ})$ and therefore
in each row of $(\sigma_{IJ})$. If $\sigma_{I_1 I_2} = 1$, the matrix $(\sigma_{IJ})$ defines the permutation of
the $G_a$ taking $I_1$ into $I_2$. The matrix $(\sigma'_{IJ})$ defines the inverse of this
permutation. Equation (3.3) becomes the first equation of (3.1), equation (3.2) implies
the second equation of (3.1), and the third equation of (3.1) has already been
verified. The equations of (3.1) induce an isomorphism between the
permutations defined by the $(\sigma_{IJ})$ permutation matrices and $\mathfrak{M}'$.


COROLLARY 1. Suppose in Theorem 3 that $\mathfrak{M}$ contains its limit matrices(8).
Then the corresponding permutation group on the $G_a$ has the property that each $G_a$
can go only into a finite number of the $G_a$. If in addition it is supposed that
corresponding to each $A \in \mathfrak{M}$ and positive integer $n$ there is a $B \in \mathfrak{M}$ such that
$B^n = A$, then $\mathfrak{M}$ consists of only a single matrix, of the type described in
Theorem 2.


(8) The matrices $\{M^{(n)}: (m^{(n)}_{ij})\}$ will be said to converge to $M:(m_{ij})$, $M^{(n)} \to M$, if $m^{(n)}_{ij} \to m_{ij}$
for all $i, j$. The limit matrices of $\mathfrak{M}$ are matrices which are limits of convergent sequences of
matrices in $\mathfrak{M}$.


Suppose that $\mathfrak{M}$ contains its limiting matrices, and that some $G_a$, say $G_\alpha$,
goes into infinitely many $G_a$ under the permutations of the group. Then there
is a limiting matrix $(p_{ij})$ of $\mathfrak{M}$ such that $p_{ij} = 0$ if $i \in G_\alpha$. But a matrix with
these rows of zeros cannot be in $\mathfrak{M}$, so $G_\alpha$ cannot have the supposed property.
The first part of the corollary is thus proved. Now suppose both hypotheses of
the corollary are satisfied. It will be sufficient to prove that the group of
permutations on the $G_a$ is the identity. Let $G_\alpha$ be any $G_a$. We have already shown
that $G_\alpha$ can go only into a finite number of $G_a$, say $G_{\alpha_1}, \ldots, G_{\alpha_j}$, under the
permutations of the group. The permutations then permute $G_{\alpha_1}, \ldots, G_{\alpha_j}$
among themselves, and any element of the group of permutations on
$G_{\alpha_1}, \ldots, G_{\alpha_j}$ has order a factor of $j!$. But any element in this group of
permutations is by hypothesis the $j!$th power of some other element; it must
therefore be the identity. Then $j = 1$, and $G_\alpha$ is transformed into itself by
every permutation of the group, as was to be proved.
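
The group-theoretic step used here, that every $j!$th power in a group of permutations of $j$ objects is the identity, can be confirmed by brute force for a small $j$. The Python sketch below is only an illustration (the value $j = 4$ and the helper names are arbitrary choices): it raises every permutation of four objects to the power $4!$.

```python
from itertools import permutations
from math import factorial

def compose(p, q):
    """Composition p after q of permutations given as tuples: i -> p[q[i]]."""
    return tuple(p[i] for i in q)

def power(p, k):
    """k-th power of the permutation p, by repeated composition."""
    result = tuple(range(len(p)))
    for _ in range(k):
        result = compose(p, result)
    return result

j = 4
identity = tuple(range(j))
# The set of all j!-th powers of permutations of j objects.
powers = {power(p, factorial(j)) for p in permutations(range(j))}
```

The set `powers` reduces to the identity alone, so a permutation that is the $j!$th power of another one must itself be the identity, as the proof asserts.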


COROLLARY 2. Any matrix function $P(t): (p_{ij}(t))$ with measurable elements
$p_{ij}(t)$ satisfying (0.1) for all $t$ (including 0 and negative values) is independent
of $t$: $P(t) = U$, where $U$ has the properties described in Theorem 2.


We can assume that some $P(t)$ is not the null matrix, or there would be
nothing to prove. The matrices $P(t)$ form a family $\mathfrak{M}$ satisfying the conditions
of Theorem 3. Moreover each $p_{ij}(t)$ is continuous, if $t > 0$, by Theorem 1, and
so for all $t$, from (0.1). Using the notation of Theorem 3, if $i \notin F$, $p_{ij}(t) = u_j$ or
$p_{ij}(t) = 0$. Then if $i \notin F$, $p_{ij}(t)$ is independent of $t$. This means that $\mathfrak{M}'$ consists
only of the identity, so $P(t)$ is independent of $t$: $P(t) = U$. The example above
shows that the measurability of the $p_{ij}(t)$ is a necessary part of the hypotheses.


THEOREM 4. If the $p_{ij}(t)$ satisfying (0.1) are continuous, then $\lim_{t \to 0} P(t) = U$
exists. The matrix $U$ is a non-null matrix of the type described in Theorem 2,
and(9)

(4.1) $UP(t) \le P(t)U = P(t).$

(In the following we shall use the notation of Theorem 2.) Moreover

(4.2) $p_{ij}(t) = 0 \quad (j \in F).$

There are continuous functions $\Pi_{IJ}(t)$, satisfying (0.1) and


(9) An inequality between two matrices is defined to mean the same inequality between
their corresponding elements.
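(4.3) $\lim_{t \to 0} \Pi_{IJ}(t) = \delta_{IJ}$

such that

(4.4) $p_{ij}(t) = \Pi_{IJ}(t) u_j \quad (i \in I, j \in J).$

There are continuous functions $\Pi_{iK}(t)$ ($i \in F$) such that(10)

(4.5) $p_{ik}(t) = \Pi_{iK}(t) u_k \quad (i \in F, k \in K), \qquad \Pi_{iK}(t) \ge \sum_J \rho_{iJ} \Pi_{JK}(t).$

Conversely, if the $p_{ij}(t)$ satisfy (0.1) and if $\lim_{t \to 0} P(t)$ exists, the $p_{ij}(t)$ are
continuous.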


Neglecting subscripts in $F$, this theorem reduces the study of $P(t)$ to that
of $(\Pi_{IJ}(t))$, in which case the limit matrix ($t \to 0$) is the identity.
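
A small numerical case may make this reduction concrete. The Python sketch below is an illustration under stated assumptions, not a construction from the paper: it takes $F$ empty, two hypothetical classes $G_1 = \{0, 1\}$ and $G_2 = \{2\}$ with weights `u`, and for $(\Pi_{IJ}(t))$ an assumed two-state chain with rates `a`, `b`. It then forms $p_{ij}(t) = \Pi_{IJ}(t) u_j$ and checks (0.1), the limit $P(t) \to U$ as $t \to 0$, and $UP(t) = P(t)U = P(t)$.

```python
import math

def Pi(t, a=1.0, b=2.0):
    """Assumed two-state chain on the classes; tends to the identity as t -> 0."""
    e = math.exp(-(a + b) * t)
    s = a + b
    return [[(b + a * e) / s, (a - a * e) / s],
            [(b - b * e) / s, (a + b * e) / s]]

u = [0.25, 0.75, 1.0]          # weights u_j; they sum to 1 on each class
cls_of = {0: 0, 1: 0, 2: 1}    # class index of each subscript (F is empty)

def P(t):
    """p_ij(t) = Pi_IJ(t) * u_j, with I, J the classes of i, j."""
    Pi_t = Pi(t)
    return [[Pi_t[cls_of[i]][cls_of[j]] * u[j] for j in range(3)]
            for i in range(3)]

# Expected limit matrix: u_ij = u_j inside a class, 0 across classes.
U = [[u[j] if cls_of[i] == cls_of[j] else 0.0 for j in range(3)]
     for i in range(3)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def close(A, B, tol):
    return all(abs(A[i][j] - B[i][j]) <= tol for i in range(3) for j in range(3))

s, t = 0.4, 0.9
semigroup = close(matmul(P(s), P(t)), P(s + t), 1e-12)
limit_is_U = close(P(1e-9), U, 1e-8)
absorbs_U = (close(matmul(U, P(t)), P(t), 1e-12)
             and close(matmul(P(t), U), P(t), 1e-12))
```

In this example the limit $U$ is idempotent but is not the identity matrix, while the reduced matrix $(\Pi_{IJ}(t))$ does tend to the identity.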

Let $U:(u_{ij})$, $U':(u'_{ij})$ be limiting matrices of $P(t)$, $t \to 0$. Then (0.1)
implies (4.1). The equal $i$th row sums in (4.1) are

(4.6) $\sum_j p_{ij}(t) \sum_k u_{jk} = \sum_k p_{ik}(t) = 1.$

Since the row sums of $U$ are less than or equal to 1, (4.6) implies that if
$\sum_k u_{jk} < 1$, $p_{ij}(t) = 0$. Then in this case $u_{ij} = u'_{ij} = 0$ also. It follows from (4.1)
that

(4.7) $\sum_j u'_{ij} u_{jk} \le u'_{ik} \quad (U'U \le U').$

Summing over $k$, since $u'_{ij} = 0$ if $\sum_k u_{jk} < 1$, we see that both sides of (4.7)
have sum $\sum_k u'_{ik}$; so there is equality in (4.7):

(4.7') $U'U = U'.$

Replacing $U$ by $U'$ in the inequality $UP(t) \le P(t)$, and letting $t$ approach 0 in
such a way that $P(t) \to U$, we obtain

(4.8) $U'U \le U.$

Then combining (4.7') and (4.8), we have $U' \le U$, and by symmetry $U \le U'$;
so $U = U'$. There is thus only one limiting matrix $U$: $P(t) \to U$. Since equation
(4.7') becomes $U^2 = U$, Theorem 2 is applicable. In the following, we shall
use the notation of that theorem. If $k \in F$, $u_{jk} = 0$; therefore (using (4.1))
$p_{ik}(t) = 0$ also, for all $i$. Then $U$ is not the null matrix. If $i, k \notin F$, (4.1) implies


(4.9) (= = (k EK), 


(2°) If the G, contain only one subscript each, so that (t-+0) if then we can 
read ;;(t) for for throughout. 




and 


(4.10) uipir(t) S pir(2) D). 

There is equality in (4.10) (and we shall refer to it as if it were so written),
because summing over $k$ gives 1 on both sides. Equations (4.9) and (4.10)
imply that if $i, k \notin F$, $p_{ik}(t)/u_k$ depends only on $I$, $K$: $p_{ik}(t) = \Pi_{IK}(t) u_k$.
Evidently the matrix function $(\Pi_{IJ}(t))$ satisfies (0.1), and (4.3) is true. If $i \in F$,
and $k \notin F$, (4.1) implies


*(4.11) D> pis S ( = pir(t), 
J 


so that if $\Pi_{iK}(t)$ is defined as the parentheses in (4.11), (4.5) follows at once.
Conversely, if $P(t) \to U$ ($t \to 0$),

(4.12) $\lim_{t \to 0} P(s+t) = P(s)U.$

The function $p_{ij}(t)$, having a right-hand limit for all $t$, has at most
denumerably many discontinuities, is therefore measurable, and continuous
(Theorem 1).


THEOREM 5. Let $\alpha$ be a given subscript. Then if $P(t)$ satisfies (0.1), $p_{\alpha j}(t)$
will be continuous and $\lim_{t \to 0} p_{\alpha j}(t)$ will exist, for all $j$, if $\sum_j p_{\alpha j}(t)$ converges
uniformly in some interval $0 < t < t_0$.


Doeblin (I) proved that if $P(t)$ satisfies (0.1) and is finite-dimensional,
then the $p_{ij}(t)$ are continuous and have unique limits as $t \to 0$. This fact, which
evidently is a consequence of Theorem 5, can be proved directly as follows.
Let $\mathfrak{M}$ be the set of limiting matrices of $P(t)$, $t \to 0$. Then $\mathfrak{M}$ satisfies the
conditions of Theorem 3, Corollary 1, so $\mathfrak{M}$ contains only a single matrix $U$. It
follows that $P(t) \to U$, and the $p_{ij}(t)$ are then continuous, by Theorem 4.

Proof of Theorem 5. Let G be the set of subscripts a with the property 
described in the theorem. The equation P(s)P(t)=P(t)P(s) implies that if 
A:(a;;) is a limiting matrix of P(¢), then 


(5.1) S Pis(t)a jx. 


If iG, then }\ja;;=1, and the sum over & on the left is 1, so that on the right 
is also 1. Then there is equality in (5.1): 


If Dien <1, then ~;;(¢)=0, or the sum over k on the right in (5.1’) would 
not be 1. If 7G, we can find an A with Dien <1, whence it follows that 


$p_{ij}(t) = 0$ if $i \in G$, $j \notin G$. Let $P'(t)$ be the matrix obtained by dropping all
elements of $P(t)$ with a subscript not in $G$. Then $P'(t)$ satisfies (0.1) and has the
property that any limiting matrix ($t \to 0$) has row sums 1. The proof given
above of Doeblin's result goes through word for word, applied to $P'(t)$. We
have thus proved that $p_{\alpha j}(t)$ is continuous, and $\lim_{t \to 0} p_{\alpha j}(t)$ exists, if $\alpha \in G$,
and in addition that $p_{\alpha j}(t) = 0$ if $j \notin G$.

We now turn to an examination of the limiting matrices of $P(t)$, as $t \to \infty$.


THEOREM 6. Define the matrix $U:(u_{ij})$ by

(6.1) $\liminf_{t \to \infty} p_{ij}(t) = u_{ij}.$

Then

(a) $U$ is a limiting matrix of $P(t)$, as $t \to \infty$; $U$ has the properties described
in Theorem 2, and $P(t)U = UP(t) = U$;

(b) (6.1) can be sharpened to

(6.1') $\lim_{t \to \infty} p_{ij}(t) = u_{ij}$

if $i$ is a subscript such that $\sum_j u_{ij} = 1$(11).

(c) Using the notation of Theorem 2, and assuming that $U$ is not the null
matrix,
= 0 GELzZED, 


(6.2) Urpr s(t) = 


L Pit) + LD pisoix = pix (i€ F). 
jGK 


If iG F or if jE F (6.1’) is true. If iE F, pi;(t) is continuous, and lim:.o pi;(t) 

exists. Moreover 

(6.3) lim pic(t) = pix, lim = 0 (i € F). 

The fact that if $P(t)$ is finite-dimensional (6.1') is always true, which
follows from Theorem 6, can be proved directly as follows. The set of limiting
matrices (in this case) of $P(t)$, $t \to \infty$, is seen at once to have the properties
required in Theorem 3, Corollary 1, so there is only one limiting matrix $U$:
$P(t) \to U$. This argument breaks down in the infinite-dimensional case, in
which a more detailed analysis is necessary.
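
For a finite-dimensional case the limit can simply be computed. The Python sketch below is an illustration only, using an assumed two-state chain with hypothetical rates `a`, `b`; it compares $P(t)$ for a large $t$ with the candidate limit $U$ and checks $U^2 = U$ and $P(t)U = UP(t) = U$.

```python
import math

def two_state_P(t, a=1.0, b=2.0):
    """P(t) for a two-state chain with assumed jump rates a and b."""
    e = math.exp(-(a + b) * t)
    s = a + b
    return [[(b + a * e) / s, (a - a * e) / s],
            [(b - b * e) / s, (a + b * e) / s]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) <= tol for i in range(2) for j in range(2))

a, b = 1.0, 2.0
# Candidate limit: both rows equal the stationary distribution (b, a)/(a+b).
U = [[b / (a + b), a / (a + b)],
     [b / (a + b), a / (a + b)]]

converges = close(two_state_P(50.0), U)          # P(t) is already at U for large t
idempotent = close(matmul(U, U), U)              # U^2 = U
Pt = two_state_P(0.7)
invariant = close(matmul(Pt, U), U) and close(matmul(U, Pt), U)
```

Here there is a single class $G_1 = \{0, 1\}$ with weights $u_j$ given by the common row of $U$, and $F$ is empty.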

Let $\mathfrak{L}$ be the class of limiting matrices $(a_{ij})$ of $P(t)$, as $t \to \infty$. Then $\mathfrak{L}$
includes all its limit matrices. This implies that $\mathfrak{L}$ contains one or more matrices
minimizing $\sum_{i,j} 2^{-i-j} a_{ij}$. Let $\mathfrak{M}$ be the class of these minimal matrices. We
shall show that $\mathfrak{M}$ contains only one matrix, $U$, defined by (6.1). The proof
will be carried through in several steps.


(11) It follows that if the matrices are finite-dimensional, (6.1') is true for all $i, j$, a fact
due to Doeblin (I).


(i) If AEX, then 
This can be deduced at once from (0.1). 
(ii) If AEM, then AP(t)=P(HNACM. 
This follows, for (0.1) implies that if A:(a:;)EM, there is a B:(b,;)EX, 
depending on / and on A, such that 


Summed over k, this means that 
7 k 


and only equality is possible, since ACM. Then (6.4) becomes the equality 
BP(t)=P(t)B=A. Therefore, 


AP(t) = P(t)BP(t) = P(A. 


By (i), P(t#)A and summing over the ith row of P(t)A =A P(t) gives 
so P(t)A EM, since AEM. 

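(iii) If $A, B \in \mathfrak{M}$, then $AB = BA \in \mathfrak{M}$.

By (ii), if $A \in \mathfrak{M}$, it follows that $AP(t) \in \mathfrak{M}$. If $B \in \mathfrak{M}$, $AB$ is a limiting
matrix of $AP(t)$, $t \to \infty$, so $AB \in \mathfrak{M}$ since $\mathfrak{M}$ is closed. Moreover by (ii),
$AP(t) = P(t)A$, so $AB \ge BA$. By symmetry, the reverse inequality is also true;
so $AB = BA$.

(iv) If $A \in \mathfrak{L}$, there is an $A' \in \mathfrak{M}$ with $A' \le A$.

For if $A \in \mathfrak{L}$, there is, using (0.1), a $B \in \mathfrak{M}$ and a $C \in \mathfrak{L}$ such that $A \ge BC$,
and since $BC \in \mathfrak{M}$ (as a limit of $BP(t) \in \mathfrak{M}$) this is the desired inequality.

(v) If $A, B \in \mathfrak{M}$, there is a $C \in \mathfrak{M}$ with $A = BC$.

We see this, for there is certainly, using (0.1), a $C \in \mathfrak{L}$ with $A \ge BC$. As
we have seen, $BC \in \mathfrak{M}$, so there must be equality.

(vi) If $A \in \mathfrak{M}$, and if $n$ is any positive integer, there is a $B \in \mathfrak{M}$ such that
$A = B^n$.

For, since $P(t/n)^n = P(t)$, if $A \in \mathfrak{M}$, there is a $B_1 \in \mathfrak{L}$ such that $B_1^n \le A$. By
(iv) there is then a $B \in \mathfrak{M}$ with $B^n \le B_1^n \le A$. To show that there must be
equality, we need only show that $B^n \in \mathfrak{M}$. Since $B \in \mathfrak{M}$, $BB = B^2 \in \mathfrak{M}$ by (iii).
Then $BB^2 = B^3 \in \mathfrak{M}$, and so on.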

Now (iii) and (v) imply that the matrices of $\mathfrak{M}$ form a commutative group.
The fact that $\mathfrak{M}$ is closed and that (vi) is true shows that $\mathfrak{M}$ has the
properties required in Theorem 3, Corollary 1. There can therefore be only a single
matrix $U:(u_{ij})$ in $\mathfrak{M}$, and $U$ has the properties described in Theorem 2.
Because of (iv), $\liminf_{t \to \infty} p_{ij}(t) = u_{ij}$. From now on we shall assume the notation
of Theorem 2. The equality $P(t)U = UP(t) = U$ follows from (ii). If $\sum_j u_{\alpha j} = 1$,
no limiting value of $p_{\alpha j}(t)$, $t \to \infty$, can be greater than $u_{\alpha j}$, or there would be a
limiting row having a sum greater than 1. Then if $\sum_j u_{\alpha j} = 1$, $\lim_{t \to \infty} p_{\alpha j}(t) = u_{\alpha j}$
for all $j$. In particular, (6.1') is true if $i \notin F$. The equations of (6.2) are
equivalent to the equations $P(t)U = UP(t) = U$. If $(a_{ij})$ is a limiting matrix of $P(t)$
as $t \to 0$, (6.2) implies that


(6.6) = G EJ). 
res 


Summing (6.6) over 7€J we obtain 


res 


Then ;a,;=1 if rE J. This implies that ;p,;(t) converges uniformly in some 
interval 0<t¢<%; so according to Theorem 5, #,;(t) is continuous, and has a 
unique limit as 0, if r© J. Then this is true for any subscript r¢ F. As t— 
in the last equation of (6.2) the first sum has an inferior limit greater than 
or equal to pix. Then there must actually be convergence; the first equation of 
(6.3) is true. The second sum in the last equation of (6.2) must then approach 
0; (6.3) is true. Equation (6.3) is impossible, since lim inf;... pi(t) = pixur, 
F, REK) unless p(t) —pixuxz; so (6.1’) is true if 7&F. The proof of the 
theorem is now complete. 

Regularity hypotheses imposed on the probability matrices can be used to simplify the above results. Thus suppose that there is a value $t_0$ of $t$ such that $\sum_j p_{ij}(t_0)$ converges uniformly in $i$. It follows readily that $\sum_j p_{ij}(t)$ converges uniformly in $i$ and $t\ge t_0$. This means that any limiting matrix of $P(t)$, $t\to\infty$, has row sums 1, so $P(t)\to U$, by Theorem 6. A less strong condition is that there be a value $t_0$ of $t$, a positive integer $N$ and a positive $\epsilon$ such that $\sum_{j\le N}p_{ij}(t_0)\ge\epsilon$ for all $i$. It follows readily that the same inequalities hold for $t\ge t_0$. Then


(6.8) un 


so there can only be a finite number of G,, and U cannot be the null matrix. 
Also if 7€ F, (6.8) becomes 


(6.8’) pixu (k K). 
kSN 
Then some p;x>0 for each F, so by (6.3), pis(t) =0, if F, 7EF. 
Thus in this case also, $P(t)\to U$, as $t\to\infty$. The fact that $P(t)\to U$ under the above hypotheses can also be derived using general theorems of Doeblin(12) or of Kryloff and Bogolioùboff(13).
If there is a set of non-negative numbers $p_1, p_2, \cdots$ such that


(6.9)    $\sum_i p_ip_{ij}(t)=p_j$    (all $j$),


(12) Thesis, Paris, 1938, pp. 105-109.
(13) Paris, Comptes Rendus de l'Académie des Sciences, vol. 204 (1937), pp. 1454-1456.


the set $p_1,\cdots$ will be called a set of (stationary) absolute probabilities. The number $p_j$ can be considered as the probability of being in the $j$th state at time $t$. Any linear combination of absolute probabilities with non-negative coefficients is also a set of absolute probabilities, or proportional to a set. If $U$ is defined as in Theorem 6, the second set of equations of (6.2) states that the $i$th row of $U$, if $i\notin F$, is a set of absolute probabilities. If $i\in F$, the $i$th row of $U$ is a linear combination (coefficients $\rho_{ik}$) of the rows of elements with first subscripts not in $F$. Then every row of $U$ is a set of absolute probabilities, or proportional to a set (if the row sum is less than 1). Moreover (6.9) implies that $\sum_i p_iu_{ij}=p_j$; so any set of absolute probabilities is a linear combination (non-negative coefficients) of rows of $U$. The states with subscripts in $F$ then always have probability 0, regardless of the absolute probabilities. One simple consequence of these remarks is that if there is a solution to (6.9), $U$ cannot be identically 0, and some row of $U$ is also a solution of (6.9); there is a solution of (6.9) determined by the equations $p_j=\lim_{t\to\infty}p_{\alpha j}(t)$, $\alpha$ fixed, not in $F$.
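
The defining relation (6.9) can be checked numerically on a small example. The following is a minimal sketch, not taken from the text: the two-state chain, its jump rates `a` and `b`, the closed-form `transition_matrix`, and the helper `apply_row` are all assumptions introduced for illustration. It verifies that the row vector $(b/(a+b),\ a/(a+b))$ is reproduced by $P(t)$ for several $t$, and that each row of $P(t)$ approaches it for large $t$.

```python
import math

def transition_matrix(t, a=2.0, b=1.0):
    """P(t) for a hypothetical two-state chain with rates a (0 -> 1), b (1 -> 0)."""
    s = a + b
    e = math.exp(-s * t)
    return [
        [b / s + (a / s) * e, (a / s) * (1.0 - e)],
        [(b / s) * (1.0 - e), a / s + (b / s) * e],
    ]

def apply_row(p, P):
    """Row vector times matrix: returns [sum_i p[i] * P[i][j] for each j]."""
    return [sum(p[i] * P[i][j] for i in range(len(p))) for j in range(len(P[0]))]

# Candidate set of stationary absolute probabilities for a = 2, b = 1.
stationary = [1.0 / 3.0, 2.0 / 3.0]

if __name__ == "__main__":
    for t in (0.1, 1.0, 5.0):
        print(t, apply_row(stationary, transition_matrix(t)))
    print(transition_matrix(50.0))
```

In this finite example every row of the limit matrix equals the stationary vector, which is the simplest instance of the statement that rows of $U$ solve (6.9).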


THEOREM 7. Suppose that the $p_{ij}(t)$ satisfying (0.1) are continuous. Then if $U$ is defined as in Theorem 6,


(7.1)    $\lim_{T\to\infty}\dfrac{1}{T}\int_0^T p_{ij}(t)\,dt=u_{ij}$


for all $i$, $j$.
Let $Q(T)$ be the matrix with general element $q_{ij}(T)$:


$q_{ij}(T)=\dfrac{1}{T}\int_0^T p_{ij}(t)\,dt.$

Since

(7.2)    $\sum_k q_{ik}(T)p_{kj}(t)=\sum_k p_{ik}(t)q_{kj}(T)=\dfrac{1}{T}\int_t^{T+t}p_{ij}(s)\,ds,$

if $U'$ is a limiting matrix of $Q(T)$, $T\to\infty$, it follows that

$U'P(t)\le U'.$


Since the row sums of $U'P(t)$ are the same as those of $U'$, there must be equality:


(7.3)    $U'P(t)=P(t)U'=U'.$

According to Theorem 6,

(7.4)    $UP(t)=P(t)U=U.$

It follows from (7.3) and (7.4) that


(7.5)    $UU'\le U'U=U'$,    $U'U\le UU'=U.$


Then $U=UU'\le U'=U'U\le U$, $U=U'$, as was to be proved.
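
The time average in (7.1) can be illustrated numerically. The sketch below is an assumption-laden example rather than anything from the text: it uses a hypothetical two-state chain with rates `a` and `b`, for which $u_{01}=a/(a+b)$, and a plain midpoint rule (`cesaro_mean`, a name introduced here) to approximate $\frac{1}{T}\int_0^T p_{01}(t)\,dt$.

```python
import math

def p01(t, a=2.0, b=1.0):
    """Transition probability 0 -> 1 of a hypothetical two-state chain."""
    s = a + b
    return (a / s) * (1.0 - math.exp(-s * t))

def cesaro_mean(f, T, steps=100_000):
    """Midpoint-rule approximation of (1/T) * integral_0^T f(t) dt."""
    h = T / steps
    return sum(f((k + 0.5) * h) for k in range(steps)) * h / T

if __name__ == "__main__":
    for T in (10.0, 100.0, 1000.0):
        print(T, cesaro_mean(p01, T))
```

For this chain the averages approach $2/3$ at rate $O(1/T)$, so the gap shrinks as $T$ grows.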
The following is a simple example illustrating the fact that $U$ in Theorems 6, 7 may be the null matrix. Let $p_{ij}(t)=0$ if $j<i$, and otherwise define $p_{ij}(t)$ by

(7.6)    $p_{ij}(t)=\dfrac{e^{-t}t^{j-i}}{(j-i)!}.$

Evidently $p_{ij}(t)\to0$, as $t\to\infty$. There can be no stationary absolute probabilities in this case.
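
A short numerical check of this example follows. It is a sketch under a stated assumption: the printed formula (7.6) is damaged in this copy, and the code takes it to be the Poisson transition function $p_{ij}(t)=e^{-t}t^{j-i}/(j-i)!$ for $j\ge i$. The function names `p` and `chapman_kolmogorov_gap` are introduced here for illustration.

```python
import math

def p(i, j, t):
    """Assumed form of (7.6): Poisson transition function with unit rate."""
    if j < i:
        return 0.0
    n = j - i
    return math.exp(-t) * t ** n / math.factorial(n)

def chapman_kolmogorov_gap(i, j, s, t):
    """|sum_k p_ik(s) p_kj(t) - p_ij(s + t)|; only i <= k <= j contribute."""
    total = sum(p(i, k, s) * p(k, j, t) for k in range(i, j + 1))
    return abs(total - p(i, j, s + t))

if __name__ == "__main__":
    print(chapman_kolmogorov_gap(0, 5, 0.7, 1.3))
    print([p(0, 3, t) for t in (1.0, 10.0, 60.0)])
```

Under this assumption each fixed element tends to 0 while every row still sums to 1, so the mass escapes to ever larger subscripts and no row of a limit matrix can carry positive probability.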

In examining the successive transitions of the system, we shall assume that the system is initially in a state $\alpha$, where $\alpha$ will be held fixed throughout the discussion. Let $\xi(t)$ be the number of the state assumed by the system at time $t$. Then $\xi(t)$, for each fixed value of $t$, is a chance variable: $\xi(0)=\alpha$; $\xi(t)=j$ with probability $p_{\alpha j}(t)$ if $t>0$. To discuss the continuity properties of $\xi(t)$ in $t$ we shall assume a minimum of regularity properties of $P(t)$, to which we shall be led in a natural way. In order to discuss the probability measures under consideration, we must, as usual, find a space $\Omega^*$ of points $\omega$, a measure defined on $\Omega^*$, and a one-parameter family of measurable functions $x_t(\omega)$, $0\le t<\infty$, such that the probability relations of the chance variables $\{\xi(t)\}$ become measure relations of the functions $\{x_t(\omega)\}$. Let $\Omega^*$ be the space of all functions $x(t)$, $0\le t<\infty$, taking on the integral values used in the subscripts of $P(t)$. A probability measure on $\Omega^*$ is defined as follows. If $0=t_0<t_1<\cdots<t_n$, the conditions


(8.1)    $x(t_j)=a_j$    $(j=0,\cdots,n)$

determine a subset of $\Omega^*$ and the measure of this subset is defined by

(8.2)    $P^*\{x(t_j)=a_j,\ j=0,\cdots,n\}=\delta_{\alpha a_0}\,p_{a_0a_1}(t_1-t_0)\cdots p_{a_{n-1}a_n}(t_n-t_{n-1}).$
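
The product rule that assigns measure to the sets determined by conditions (8.1) can be sketched in code. This is an illustration only: the two-state transition function `p`, its rates, and the name `cylinder_probability` are assumptions introduced here, and the product form used is the standard Markoff one (initial state $\alpha$, then one transition factor per consecutive pair of times).

```python
import math
from itertools import product

def p(i, j, t, a=2.0, b=1.0):
    """p_ij(t) for a hypothetical two-state chain (states 0 and 1)."""
    s = a + b
    e = math.exp(-s * t)
    stay = (b / s + (a / s) * e) if i == 0 else (a / s + (b / s) * e)
    return stay if i == j else 1.0 - stay

def cylinder_probability(alpha, states, times):
    """Measure of {x(t_k) = states[k], k = 0..n} for a chain started at alpha.

    Assumes times[0] == 0 and that times is strictly increasing.
    """
    if states[0] != alpha:
        return 0.0
    prob = 1.0
    for k in range(1, len(times)):
        prob *= p(states[k - 1], states[k], times[k] - times[k - 1])
    return prob

if __name__ == "__main__":
    times = [0.0, 0.5, 1.2]
    for a1, a2 in product((0, 1), repeat=2):
        print((0, a1, a2), cylinder_probability(0, [0, a1, a2], times))
```

Summing over the intermediate state recovers $p_{\alpha j}(t)$ at the last time, which is the consistency that Kolmogoroff's extension theorem requires of the finite-dimensional measures.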


By a theorem of Kolmogoroff(14), a completely additive measure function is determined on $\Omega^*$ by these conditions. Let $x_s(\omega)$ be the function of $\omega:x(t)$ which takes on the numerical value $f(s)$ if $\omega$ is the function $f(t)$. Then the probability relations of the chance variables $\{\xi(t)\}$ become measure relations of the measurable functions $\{x_t(\omega)\}$:


$P^*\{x_t(\omega)=j\}=p_{\alpha j}(t)$

and so forth. We shall sometimes write $x(t)$ instead of $x_t(\omega)$, so that "$x(t)$" can mean, for example: (a) a point $\omega$ of $\Omega^*$; (b) a function $x_t(\omega)$ of $\omega$; (c) a number, the value of the function $x(t)$ at the point $t$. When there is any danger of confusion, the proper meaning will be explicitly stated. The function $x_t(\omega)$ is automatically defined on any subset of $\Omega^*$, and it is usually desirable to restrict $\omega$ to be in a subset $\Omega$ of $\Omega^*$, of outer measure 1, defining a $P$-measure on $\Omega$ by setting $P(\Lambda^*\cdot\Omega)=P^*(\Lambda^*)$ for any $P^*$-measurable set(15) $\Lambda^*$. The probability relations of the chance variables $\{\xi(t)\}$ now become the measure relations of the functions $x_t(\omega)$, $\omega\in\Omega$: $P\{x_s(\omega)=j\}=p_{\alpha j}(s)$ and so on(16). It has been shown(17) that if any $P^*$-measure is given, there always corresponds an everywhere dense denumerable sequence of real numbers $R:\{r_j\}$ such that if $I$ is any open interval, and if $s\in I$,


(8.3)    $P^*\{\mathrm{G.L.B.}_{r_j\in I}\,x(r_j)\le x(s)\le\mathrm{L.U.B.}_{r_j\in I}\,x(r_j)\}=1$(18).


(14) Grundbegriffe der Wahrscheinlichkeitsrechnung, Ergebnisse der Mathematik, vol. 2, no. 4, pp. 24-30. The fact that our functions assume only integral values, whereas those of Kolmogoroff assume all values necessitates only trivial changes in the proof.


It has been shown(19) that $\Omega$ can be chosen to consist of all (possibly infinite-valued) functions $x(t)$ which satisfy the relation


(8.4)    $\liminf_{r_j\to t}x(r_j)\le x(t)\le\limsup_{r_j\to t}x(r_j)$


for all $t\notin R$. Then if this is done,

(8.5)    $\mathrm{G.L.B.}_{r_j\in I}\,x(r_j)\le x(s)\le\mathrm{L.U.B.}_{r_j\in I}\,x(r_j)$


for all $s\in I$, $\omega:x(t)$ in $\Omega$, sharpening (8.3). Such a space $\Omega$ is called quasi-separable, and the process, that is, the combination of $\Omega$ with its $P$-measure, is called a quasi-separable process.

A measure can be defined on the space $T\times\Omega$ of couples $(t,\omega)$, as the product of Lebesgue measure on the $t$-axis and $P$-measure on $\omega$. The process is called measurable if the function $x_t(\omega)$ is $(t,\omega)$-measurable. The $P^*$-measure is then said to determine a measurable process. This hypothesis on the $P^*$-measure is certainly a minimum hypothesis. On the other hand, there are natural analytic restrictions on the $p_{ij}(t)$. Let $G_\alpha$ be the set of subscripts $j$ such that $p_{\alpha j}(t)\not\equiv0$. Only the subscripts in $G_\alpha$ need be considered in analyzing the transitions of the system, supposed initially in state $\alpha$. It follows readily from (0.1) that $p_{ij}(t)\equiv0$ if $i\in G_\alpha$, $j\notin G_\alpha$. The matrix $P_\alpha(t):p_{ij}(t)$ with $i,j\in G_\alpha$ then satisfies (0.1), and it is this matrix $P_\alpha(t)$ which is essential to the discussion. The natural analytic hypotheses on $P_\alpha(t)$ would include the measurability of its elements. This, by Theorem 1, implies their continuity, and then (Theorem 4), $\lim_{t\to0}P_\alpha(t)=U$ exists. The matrix $U$ is the first determining factor of the regularity of the process. It is natural to suppose that it is the identity matrix. A glance at Theorem 4 shows that no other hypothesis can possibly be compatible with any sort of continuity in the transitions of the system.


(15) Cf. Doob, these Transactions, vol. 42 (1937), pp. 108-110.

(16) $P\{x_s(\omega)=j\}$ is to be interpreted as the $\Omega$-measure of the set of all functions $x(t)$ in $\Omega$ for which $x(s)=j$.

(17) Doob, these Transactions, vol. 47 (1940), p. 467.

(18) This equality holds for each fixed $s$. The $\omega$-set $\{x(s)<\phi,\ s\in I\}$, $\phi$ a $P^*$-measurable function, is not $P^*$-measurable, so each value of $s$ must be considered separately in (8.3), or in probability relations of similar type. The subspace $\Omega$ is introduced in order to avoid this necessity.

(19) Op. cit., pp. 468-469.

These considerations lead to the following formulation of a natural hypothesis to be imposed on the matrix function $P_\alpha(t)$. We shall denote as hypothesis $H_\alpha$ the hypothesis that the system is initially in state $\alpha$, and that if $i\in G_\alpha$, $\lim_{t\to0}p_{ii}(t)=1$. Then $\lim_{t\to0}p_{ij}(t)=\delta_{ij}$ $(i\in G_\alpha)$. Since $P_\alpha(t)$ satisfies (0.1), the $p_{ij}(t)$ $(i,j\in G_\alpha)$ will be continuous (Theorem 4). Moreover $p_{ij}(t)\equiv0$ if $i\in G_\alpha$, $j\notin G_\alpha$. Then if hypothesis $H_\alpha$ is true, and if $i\in G_\alpha$, $p_{ij}(t)$ is continuous for all $j$, and $\lim_{t\to0}p_{ij}(t)=\delta_{ij}$. Hypothesis $H_\alpha$ implies the continuity of $p_{\alpha j}(t)$ for all $j$, even though $\alpha$ may not be in $G_\alpha$. In fact the equation


$p_{\alpha j}(s+t)=\sum_i p_{\alpha i}(s)p_{ij}(t)$


shows that $p_{\alpha j}(t)$ is continuous for $t>s$, and therefore for all $t$. If $i\in G_\alpha$, then $p_{ii}(t)>0$ for all $t$, and if $i=\alpha$, $j\in G_\alpha$ or if $i,j\in G_\alpha$ then $p_{ij}(t)=0$ at most on a finite interval $0<t\le t_0$ (depending on $i$, $j$). The first fact follows from the inequality $p_{ii}(t)\ge p_{ii}(t/n)^n$, $n=1,2,\cdots$, since $\lim_{n\to\infty}p_{ii}(t/n)=1$. The second fact follows from the inequality $p_{ij}(t+h)\ge p_{ij}(t)p_{jj}(h)$ which implies that if $p_{ij}(t')>0$, then $p_{ij}(t)>0$, for $t>t'$.

The following theorem shows the relations between various hypotheses it 
would be natural to assume. 


THEOREM 8. Suppose that $P^*\{x(0)=\alpha\}=1$. Then the following three conditions on $P^*$-measure are equivalent.


(i) The $P^*$-measure determines a measurable process.
(ii) Hypothesis $H_\alpha$ is satisfied.
(iii) For every $\tau>0$,


(8.6)    $\lim_{t\to\tau}P^*\{x(t)=x(\tau)\}=1.$


In the usual language of measure theory, (8.6) states that $x_t(\omega)\to x_\tau(\omega)$ in measure. We shall prove a much stronger result below, Theorem 11. To prove Theorem 8, we prove that (i) implies (ii), (ii) implies (iii), and (iii) implies (i).

Proof that (i) implies (ii). Suppose that $P^*$-measure determines a measurable process. Then it follows(20) that for fixed $h>0$, $\epsilon>0$, $P^*\{|x(t+h)-x(t)|>\epsilon\}$ is Lebesgue measurable in $t$, and (as $h\to0$) goes to 0 in measure on any finite $t$-interval. If $\epsilon<1$,


(8.7)    $P^*\{|x(t+h)-x(t)|>\epsilon\}=\sum_j p_{\alpha j}(t)[1-p_{jj}(h)].$


Since the quantity in (8.7) goes to 0 as $h\to0$ in measure on every finite $t$-interval, and since $p_{\alpha j}(t)>0$ if $t$ is sufficiently large, $\lim_{h\to0}p_{jj}(h)=1$ if $j\in G_\alpha$. Then hypothesis $H_\alpha$ is satisfied.


(20) Doob, these Transactions, vol. 42 (1937), p. 117.

Proof that (ii) implies (iii). If hypothesis $H_\alpha$ is true, we shall prove (8.6) by evaluating the probabilities involved. If $0<\tau<t$,


(8.8)    $P^*\{x(t)=x(\tau)\}=\sum_{i\in G_\alpha}P^*\{x(\tau)=i,\ x(t)=i\}=\sum_{i\in G_\alpha}p_{\alpha i}(\tau)p_{ii}(t-\tau)$


and if $0<t<\tau$,

(8.8')    $P^*\{x(t)=x(\tau)\}=\sum_{i\in G_\alpha}p_{\alpha i}(t)p_{ii}(\tau-t).$


Now the series in (8.8) is majorized by $\sum_j p_{\alpha j}(\tau)$, and that in (8.8') by $\sum_j p_{\alpha j}(t)$. Then the series in (8.8) converges uniformly in $t$. The series $\sum_j p_{\alpha j}(t)$ is a series of non-negative continuous functions, converging to the continuous function 1, so there is uniform convergence in a neighborhood of $\tau$. Thus the series in (8.8) and (8.8') are uniformly convergent for $t$ near $\tau$, and when $t\to\tau$ both become $\sum_j p_{\alpha j}(\tau)=1$, as was to be proved.

Proof that (iii) implies (i). Condition (iii) is known (loc. cit. (20)) to imply that the $P^*$-measure determines a measurable process.


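THEOREM 9. Suppose that hypothesis $H_\alpha$ is true. Then if $i\in G_\alpha$,


(9.1)    $\lim_{t\to0}\dfrac{1-p_{ii}(t)}{t}=q_i\ (\le+\infty)$


exists. If $q_i=0$, $p_{ii}(t)\equiv1$. If $q_i=\infty$, then if $j\ne i$, $j\in G_\alpha$,

(9.2)    $\lim_{t\to0}\dfrac{p_{ij}(t)}{1-p_{ii}(t)}=\lim_{t\to0}\dfrac{p_{ji}(t)}{1-p_{ii}(t)}=0.$

If $q_i<\infty$, then for $j\ne i$, $j\in G_\alpha$, the limits


(9.3)    $\lim_{t\to0}\dfrac{p_{ij}(t)}{t}=q_{ij},\qquad\lim_{t\to0}\dfrac{p_{ji}(t)}{t}=q_{ji}$


exist, and

(9.4)    $\sum_{j\ne i}q_{ij}\le q_i.$

In the finite-dimensional case(21), $q_i<\infty$ for all $i\in G_\alpha$, and there is equality in (9.4).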
(21) Doeblin (I) proved Theorem 9 in the finite-dimensional case.


Let $R$ be a denumerable everywhere dense $t$-set. Since when $t\to\tau$, $x(t)\to x(\tau)$ in measure (Theorem 8), it follows that

(9.5)    $\liminf_{r\to\tau}x(r)\le x(\tau)\le\limsup_{r\to\tau}x(r)$    $(r\in R)$


with probability 1 (that is, almost everywhere on $\Omega^*$). Then (8.3) is satisfied.
We shall also need the following fact: If J is any open ¢-interval, and if 
<® are points in I, with max; (¢” —#,) =4,, then if 5,0, 


= LU.B. x(r) 
rERI 


(9.6) lim L.U.B. x(¢ 
4 


with probability 1. This can be proved as follows. Because of the fact that 
when t—>r, x(t)—>x(r) in measure, it surely is true that for each r in J, 


(9.7) lim inf L.U.B. x(t”) 2 


with probability 1, and (9.7) implies (9.6), because of (8.3). In the same way 
we can prove 
(9.8) lim G.L.B. = G.L.B. x(r) 
with probability 1. 

Now let 1€G,, and choose r so that pai(r) >0. Let $:(h) be the proba- 
bility that if x(r) =i then x(r) =i for r Sr S7+h (rER). (If R is used to deter- 
mine a quasi-separable process, ¢;(/) is the probability that if x(r) =i, then 
x(t) for rStS7+h.) According to (9.6) and (9.8), if r= < --- Sh, and 
max, (4 — = 5,0, then 


P*{x(r) = +h, (rE R)} = lim P*{ = i,j 2 1} 
(9.9) 
= pes(r) lim — = 
721 
Let $\{\epsilon_n\}$ be any sequence of positive numbers converging to 0. To prove (9.1) it will be sufficient to let $t\to0$ through the sequence $\{\epsilon_n\}$, and to show that there is a limit, which is independent of the sequence $\{\epsilon_n\}$. Choose the integers $m_n$ so that $m_n\epsilon_n\to h$. Then setting $t_j-\tau=j\epsilon_n$ in (9.9), $0\le j\le m_n$,


(9.10)    $\lim_{n\to\infty}p_{ii}(\epsilon_n)^{m_n}=\phi_i(h).$


This implies that if $\phi_i(h)>0$

(9.11)    $\lim_{n\to\infty}m_n\log p_{ii}(\epsilon_n)=-\lim_{n\to\infty}h\,\dfrac{1-p_{ii}(\epsilon_n)}{\epsilon_n}=\log\phi_i(h).$


We have thus shown that unless $\phi_i(h)=0$, (9.1) is true, and

(9.12)    $\phi_i(h)=e^{-q_ih}.$
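
The passage from (9.10) to (9.12) can be seen numerically. The sketch below is illustrative only and rests on assumptions not in the text: a hypothetical two-state chain with rates `a` and `b`, for which $q_0=a$, and the helper name `survival_estimate`. It evaluates $p_{00}(h/n)^n$, the probability of finding the system in state 0 at each of $n$ equally spaced instants, and compares it with $e^{-q_0h}$.

```python
import math

def p00(t, a=2.0, b=1.0):
    """p_00(t) for a hypothetical two-state chain; here q_0 = a."""
    s = a + b
    return b / s + (a / s) * math.exp(-s * t)

def survival_estimate(h, n, a=2.0, b=1.0):
    """p_00(h/n)**n: probability of being in state 0 at all n grid points."""
    return p00(h / n, a, b) ** n

if __name__ == "__main__":
    for n in (1, 10, 1000, 10**6):
        print(n, survival_estimate(1.0, n), math.exp(-2.0))
```

Coarse grids overestimate the probability of never leaving the state, because they miss excursions that return between grid points; refining the grid removes this excess.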


On the other hand, if $\phi_i(h)=0$, (9.1) is true with $q_i=\infty$, and then $\phi_i(h)\equiv0$. Since $p_{ii}(t)\ge\phi_i(t)$, $q_i=0$ implies that $p_{ii}(t)\equiv1$. In proving (9.2) and (9.3) we can assume that $p_{ii}(t)<1$ for all $t$, since otherwise (0.1) implies that $p_{ii}(t)\equiv1$, so $q_i=0$: in this case (9.2) is inapplicable; the first part of (9.3) is obvious, and the second is proved by a trivial modification of the proof below. To prove (9.2) we note that if $\eta>0$, $j\ne i$, $j\in G_\alpha$,
= — pile)” 

if ne is sufficiently small..Then if g;= ©, when n— © and e—0 so that me->#, 
(9.13) becomes 


pi ie) 
1 — pile) 


for sufficiently small ¢. When ¢-0 this gives the first part of (9.2). Similarly 
if i#j,j €Ga 


(9.14) pit) = (1 — 0) lim st sup 


is(€)” 
(9. 13’) pii(ne) p pis(e)* 2 (i- n) Pile) 
is true for sufficiently small m, and then if g;= © 
Pile) 
pile) 
for sufficiently small ¢. When ¢—0 this gives the second part of (9.2). If qi< ©, 
(9.13) implies that 


(9.14’) = (1 — 9) lien ou su 


1— 
(6) 


qi € 


(9.15) pit) = (1 — 0) 


Then 


(9.16) lim i > (1 — limsup Pile) 
€—0 € 

Since $\eta>0$ is arbitrary, this implies that $\lim_{t\to0}p_{ij}(t)/t$ exists, and the limit is finite, from (9.15). Similarly equation (9.13') implies that $\lim_{t\to0}p_{ji}(t)/t$ exists and is finite. Moreover


(9.17)    $\sum_{j\ne i}\dfrac{p_{ij}(t)}{t}=\dfrac{1-p_{ii}(t)}{t},$


so that (9.4) is true. Equation (9.17) can also be written


(9.18)    $\sum_{j\ne i}\dfrac{p_{ij}(t)}{1-p_{ii}(t)}=1.$


Then in the finite-dimensional case $q_i=\infty$ is impossible (since each term of the sum goes to 0 with $t$ if $q_i=\infty$), and (9.17) implies that there is equality in (9.4).
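
For a finite-dimensional illustration of the limits in Theorem 9, the following sketch builds $P(t)$ from a hypothetical $3\times3$ rate matrix `Q` and inspects the difference quotients at a small $t$. Everything here is an assumption made for the example: the matrix `Q`, the helpers `mat_mul` and `mat_exp` (a truncated Taylor series with scaling and squaring), and the choice of $P(t)=e^{tQ}$ as the transition function.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(Q, t, terms=20, squarings=8):
    """exp(t*Q) by a truncated Taylor series with scaling and squaring."""
    n = len(Q)
    scale = t / (2 ** squarings)
    M = [[scale * q for q in row] for row in Q]
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    result = [row[:] for row in identity]
    term = [row[:] for row in identity]
    for k in range(1, terms + 1):
        # term becomes M**k / k!
        term = [[x / k for x in row] for row in mat_mul(term, M)]
        result = [[r + x for r, x in zip(rr, tr)] for rr, tr in zip(result, term)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result

# Hypothetical 3-state rate matrix: off-diagonal entries >= 0, rows sum to 0.
Q = [[-3.0, 2.0, 1.0],
     [1.0, -1.0, 0.0],
     [2.0, 2.0, -4.0]]

def difference_quotients(i, t):
    """Return ((1 - p_ii(t)) / t, [p_ij(t) / t for j != i])."""
    P = mat_exp(Q, t)
    return (1.0 - P[i][i]) / t, [P[i][j] / t for j in range(len(Q)) if j != i]

if __name__ == "__main__":
    print(difference_quotients(0, 1e-5))
```

In this finite case the off-diagonal quotients add up to the diagonal one for every $t$, which is the equality case of (9.4).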

In discussing the continuity properties of $x(t)$ in $t$, it is usually convenient, because of measurability considerations, to choose a denumerable everywhere dense $t$-set $R$ and then consider the functions $x(r)$ for $r\in R$. The continuity properties of $x(r)$ can be interpreted as continuity properties of $x(t)$, if the proper space $\Omega$ of the stochastic process is chosen, and this will sometimes be done below.


THEOREM 10. Suppose that hypothesis $H_\alpha$ is true. Let $\tau$ be any positive number and let $R$ be any denumerable everywhere dense set. Then $\lim_{r\to\tau}x(r)=x(\tau)$ $(r\in R)$ with probability 1 if and only if whenever $p_{\alpha i}(\tau)>0$, $q_i$ is finite. If $q_i<\infty$ and if $p_{\alpha i}(\tau)>0$, $q_{ij}/q_i$ is the conditional probability that if $x(\tau)=i$, and if there is a discontinuity of $x(r)$ $(r\in R)$ before $\tau+h$, then there is a first discontinuity before $\tau+h$, which is an isolated discontinuity where $x(r)$ jumps to $j$.


The probability that $x(\tau)=i$ and that $x(r)=i$ for $r\in R$, $\tau-h<r<\tau+h$ is $p_{\alpha i}(\tau-h)\phi_i(2h)$. Then $\lim_{r\to\tau}x(r)=x(\tau)$ with probability 1 if and only if $\lim_{h\to0}\phi_i(h)=1$, that is, if and only if $q_i<\infty$, whenever $p_{\alpha i}(\tau)>0$. This proves the first part of the theorem. The second part requires a more detailed analysis. Suppose that $q_i<\infty$ and that $p_{\alpha i}(\tau)>0$. We shall evaluate the probability
of the x(¢)-set A, determined by the following conditions: x(r) =7; x(r) = for 
rER, r>r on some interval of r-values; x(r) then jumps fo j, remaining equal 
to j on some interval of length at least 4, the jump occurring before r+h. 
Let be any positive integer, and define A,,, by 


n—2 1 
= Dale) = i520) h) = 5, 
nN n 


(10.1) 
m+1 m+1 


=j,7r+ h<r<rt 
n n 
Then 


P(Ann) = 


e hatin — g—hai(i—i/n) 


— Pas(t) 


n—2 
qi 




Now if x(¢)€A, it follows that x(#)€A,,» for sufficiently large n, whenever 
<n: 
(10. 3) Ay Cc lim inf Any’: 


If x(t)€A,,, for infinitely many values of n, x(t)EA,, if no r-+(m/n)hER: 
(10.4) lim sup An,, C Ay. 

Then if 

(10.5) lim sup P(An,») P(A,) lim inf 


The inferior and superior limits in (10.5) are actual limits, evaluated in (10.2). 
Since the limit function of 7 is continuous, we obtain(?*), letting 7’—>7; 


(10.6) P(Ay) = poi(r)(1 — 


The probability that x(r) =7, that x«(r) =i for on some r-interval (rE R), 
and that then x(r) jumps to j where it remains for some r-interval, the jump 
occurring before r+h, is therefore 


(10.7) lim P(Ay) = Pas(r)(1 — 


and this equality is equivalent to the statement of the theorem. 

To make clear the meaning of Theorem 10, suppose that $\sum_j q_{ij}=q_i<\infty$ for all $i$ in $G_\alpha$. Then if $\tau>0$, $\lim_{r\to\tau}x(r)=x(\tau)$ with probability 1. Excluding an $x(t)$-set of $\Omega^*$-measure 0, each $x(t)$ in the remainder $\Lambda$ is then equal on $R$ to $x(\tau)$ for $r$ sufficiently near $\tau$. According to the second part of the theorem, we can make the excluded $\Omega^*$-set so large that if $x(t)\in\Lambda$ there will be a first discontinuity of $x(r)$ (if any) after $\tau$, a jump. Now, applying the second part of the theorem, letting $\tau$ run through all rational numbers, we see that the excluded $\Omega^*$-set can be made so large that if $x(t)\in\Lambda$, there will be a second discontinuity (if there is more than one), also a jump, a third, and so on. These discontinuities may cluster at a point, to give $x(r)$ a discontinuity which is no longer a jump.
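
The interpretation of $q_{ij}/q_i$ as the probability that the first jump out of state $i$ lands in $j$ can be illustrated by simulation. The sketch below does not follow the method of the text: it assumes the familiar construction by competing exponential clocks, valid when $\sum_jq_{ij}=q_i<\infty$, and the rates, the seed, and the names `first_jump` and `estimate` are all hypothetical.

```python
import random

def first_jump(rates, rng):
    """Sample (holding_time, target) from competing exponential clocks.

    `rates` maps each target state j to a hypothetical rate q_ij > 0.
    """
    best_time, best_state = None, None
    for state, rate in rates.items():
        clock = rng.expovariate(rate)
        if best_time is None or clock < best_time:
            best_time, best_state = clock, state
    return best_time, best_state

def estimate(rates, samples=20_000, seed=0):
    """Empirical jump frequencies and mean holding time."""
    rng = random.Random(seed)
    counts = {state: 0 for state in rates}
    total_time = 0.0
    for _ in range(samples):
        holding, target = first_jump(rates, rng)
        counts[target] += 1
        total_time += holding
    freqs = {state: counts[state] / samples for state in rates}
    return freqs, total_time / samples

if __name__ == "__main__":
    print(estimate({1: 2.0, 2: 1.0}))
```

With rates 2 and 1 the frequencies should lie near $2/3$ and $1/3$, and the mean holding time near $1/q_i=1/3$, up to sampling error.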

We shall use a somewhat indirect method in examining in more detail the transitions of the system, that is, the discontinuities of $x(t)$. This method has the advantage of exhibiting analytically the relation between the regularity of the matrix function $(p_{ij}(t))$ and the discontinuities of $x(t)$.


(22) We have tacitly assumed the measurability of A,. This is easily proved directly, or 
the above discussion can be modified, using inner and outer measures in (10.5), to furnish the 
proof that A, is measurable, besides evaluating its measure. The restriction we have made on h 
is essential for (10.4), but evidently (10.5) and (10.6) are true without this restriction. 




Let $y_t$ (for each $t$ in some point set) be a chance variable. The family of chance variables $\{y_t\}$ will be said to have the property $\mathcal{E}$ if (for any natural number $n$) whenever $t_1<\cdots<t_{n+1}$,


(11.1)    $E\{y_{t_1},\cdots,y_{t_n};\,y_{t_{n+1}}\}=y_{t_n}$


with probability 1(23). Suppose a family $\{y_t\}$ has the property $\mathcal{E}$, for $t$ in some interval $(a,b)$. Then if $a<\tau<b$, $t_n\uparrow\tau$ implies that $\lim_{n\to\infty}y_{t_n}=y_{\tau-}$ exists with probability 1, and the limit $y_{\tau-}$ is independent (neglecting zero probabilities) of the particular sequence $\{t_n\}$. The chance variable $y_{\tau+}$ is defined similarly in terms of approach from above. Moreover, $y_{\tau-}=y_{\tau+}=y_\tau$ with probability 1, if $\tau$ is not in some set, which is at most denumerable. We shall call this set the set of fixed discontinuities. If $R$ is any denumerable set, dense on $(a,b)$, $y_r$ $(r\in R)$, with probability 1, considered as a function of $r$ alone is equal to a function defined on $(a,b)$, and continuous on the right at every point of $(a,b)$ not a fixed discontinuity point. It will be useful below to say that a family of chance variables $y_t$ has the property $\mathcal{E}^*$ if the family $y_{-t}$ has the property $\mathcal{E}$.
It is easily verified that if $T>0$, and if $y_t$ is defined by


(11.2)    $y_t=p_{x(t)j}(T-t)$    $(0<t<T)$,


then the family of chance variables $\{y_t\}$ has the property $\mathcal{E}$(25). We shall show that there are no fixed discontinuities if hypothesis $H_\alpha$ is true. To do this it will be sufficient to show that if $t$ is given, and if $t_n\to t$, then some subsequence of $\{y_{t_n}\}$ converges to $y_t$ with probability 1. Since according to Theorem 8, $x_{t_n}(\omega)$ converges to $x_t(\omega)$ in measure, some subsequence converges to $x_t(\omega)$ with probability 1. Then, along this subsequence,


(11.3)    $y_{t_n}=p_{x(t)j}(T-t_n)$


for large $n$, with probability 1, so that $y_{t_n}\to y_t$ with probability 1, because the $p_{ij}(t)$ are continuous if $i\in G_\alpha$ and for each $t$, $P^*\{x_t(\omega)\in G_\alpha\}=1$. In a similar way it can be proved that if $T^*>0$, and if $y^*_t$ is defined by


(11.4)    $y^*_t=\dfrac{p_{jx(t)}(t-T^*)}{p_{\alpha x(t)}(t)}$    $(t>T^*)$,


the family of chance variables $\{y^*_t\}$ has the property $\mathcal{E}^*$, and there are no fixed discontinuities, if hypothesis $H_\alpha$ is true. The chance variable $y^*_t\cdot p_{\alpha j}(T^*)$ bears the same relation to the inverse process ($t$ decreasing) as $y_t$ bears to the given process. For each $t$, the denominator in (11.4) vanishes only with probability 0 (hypothesis $H_\alpha$). The following two regularity conditions on the $p_{ij}(t)$ will be useful.


(23) The notation $E\{y_{t_1},\cdots,y_{t_n};\,y_{t_{n+1}}\}$ will be used to denote the conditional expectation of $y_{t_{n+1}}$ for given values of $y_{t_1},\cdots,y_{t_n}$, a function of the latter variables.

(24) The properties of such a family, summarized here, are proved in the author's paper in these Transactions, vol. 47 (1940), pp. 455-486. This paper will be referred to as "$\mathcal{E}$."

(25) This fact is a result of the well known relations between conditional expectation functions, and is a special case of the fact that if $\{w_t\}$ is any family of chance variables, if $z$ is a chance variable dependent on the $w_t$, and if $z_t=E\{w_s,\ s\le t;\,z\}$ ($z_t$ = expectation of $z$ for $w_s$ given for $s\le t$), then the family $\{z_t\}$ has the property $\mathcal{E}$.


CONDITION $C(\beta)$. Let $\beta$ be in $G_\alpha$. Then there are numbers $\eta$, $\delta$ such that for all $i\ne\beta$ in $G_\alpha$, and all $s<\delta$


(11.5)    $p_{i\beta}(s)<1-\eta.$


ConpDITION C*(8, 7). Let B be in Ga and let + be a positive number with 
Pas(t) >0. There are positive numbers n, 6 such that if 0<s <6, pai(r) >90, 
then 


(11.5%) bails) < (1 — 


Under hypothesis $H_\alpha$, if $i$ is fixed and $s\to0$ in (11.5), the inequality becomes $0<1-\eta$, and under the same circumstances, (11.5*) becomes


< Pas(7) 


Then conditions $C(\beta)$ and $C^*(\beta,\tau)$ are certainly always satisfied in the finite-dimensional case, under hypothesis $H_\alpha$, for all possible $\beta$ and pairs $\beta$, $\tau$ ($\beta\in G_\alpha$), respectively.

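Condition $C(\beta)$ can be put in an interesting alternate form. If condition $C(\beta)$ is not satisfied, there is a sequence of distinct integers $\{i_\nu\}$ in $G_\alpha$, and a sequence $\{s_\nu\}$, $s_\nu\to0$, such that $p_{i_\nu\beta}(s_\nu)\to1$. Now if $t>0$, and if $\nu$ is so large that $s_\nu<t$,


(11.7)    $p_{i_\nu j}(t)=p_{i_\nu\beta}(s_\nu)p_{\beta j}(t-s_\nu)+\sum_{k\ne\beta}p_{i_\nu k}(s_\nu)p_{kj}(t-s_\nu).$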


(11.6) 0 (1 — n). 


If $j\notin G_\alpha$, $p_{i_\nu j}(t)\equiv0$. If $j\in G_\alpha$, the sum on the right is at most


$\sum_{k\ne\beta}p_{i_\nu k}(s_\nu)=1-p_{i_\nu\beta}(s_\nu)\to0.$


Then (11.7) implies

(11.8)    $\lim_{\nu\to\infty}p_{i_\nu j}(t)=p_{\beta j}(t)$

for all $j$, $t$. Thus (under hypothesis $H_\alpha$) condition $C(\beta)$ is satisfied if (and, as is easily seen, only if) no sequence of distinct rows (whose elements have first subscripts in $G_\alpha$) converges, element by element to the $\beta$th row, for all $t$. An analogous but less elegant form of $C^*(\beta,\tau)$ can be obtained.

The following theorem makes Theorem 10 more precise. 




THEOREM 11. Suppose that hypothesis $H_\alpha$ is true. Let $\tau$ be any positive number, and suppose that $R$ is any denumerable set having $\tau$ as a limit point. Then


(11.9)    $\lim_{r\to\tau}\dfrac{x(r)-x(\tau)}{1+x(r)^2}=0$    $(r\in R)$


with probability 1. If $p_{\alpha\beta}(\tau)>0$, then $\lim_{r\to\tau}x(r)=\beta$ whenever $x(\tau)=\beta$, (with probability 1) if and only if $q_\beta<\infty$. If $C(\beta)$ or $C^*(\beta,\tau)$ is satisfied, then $q_\beta<\infty$.


Since, as we have seen in Theorem 8, if $r\to\tau$, $x_r(\omega)\to x_\tau(\omega)$ in measure, that is, $x(r)\to x(\tau)$ in measure, it is impossible that $|x(r)|\to\infty$ with positive probability along any sequence of $r$-values approaching $\tau$. Therefore, neglecting 0 probabilities ($\Omega^*$-sets of measure 0), (11.9) implies that $x(r)$ always has $x(\tau)$ as a limiting value when $r\to\tau$, (the only finite limiting value) but $x(r)$ may also have $+\infty$ as a limiting value. In the course of the proof of Theorem 10, we have already proved that if $p_{\alpha\beta}(\tau)>0$, $\lim_{r\to\tau}x(r)=\beta$ whenever $x(\tau)=\beta$ with probability 1, if and only if $q_\beta<\infty$: in fact this statement follows directly from our evaluation of $\phi_\beta(h)$. To prove (11.9) we shall use the families of chance variables $\{y_t\}$, $\{y^*_t\}$ introduced above. Since these families have no fixed discontinuities,


(11.10)    $\lim_{r\to\tau}p_{x(r)j}(T-r)=p_{x(\tau)j}(T-\tau),$


with probability 1. Let $\Lambda$ be an $\omega$-set of measure 1, such that the following conditions are satisfied, if $x(t)\in\Lambda$:

(a) (11.10) is true for all $j$ and rational $T>\tau$;

(b) if $p_{\alpha j}(r)=0$, then $x(r)\ne j$; if $p_{\alpha j}(\tau)=0$, then $x(\tau)\ne j$.

Now suppose that $x_0(t)\in\Lambda$, and suppose that $x_0(\tau)=\beta$. Suppose that $\liminf_{r\to\tau}|x_0(r)|<+\infty$. Then there is an integer $\gamma$ such that $x_0(r)=\gamma$ for infinitely many values of $r$, as $r\to\tau$. Condition (a) implies


(11.11)    $\lim_{r\to\tau}p_{\gamma j}(T-r)=p_{\gamma j}(T-\tau)=p_{\beta j}(T-\tau),$


for all $j$ and rational $T>\tau$. (We are using the fact that because of condition (b), $p_{\alpha\gamma}(t)\not\equiv0$, so, in accordance with hypothesis $H_\alpha$, $p_{\gamma j}(t)$ is continuous in $t$.) Because of hypothesis $H_\alpha$, $\lim_{t\to0}p_{\gamma j}(t)=\delta_{\gamma j}$, $\lim_{t\to0}p_{\beta j}(t)=\delta_{\beta j}$ for all $j$. Thus if we let $T\to\tau$, (11.11) implies that $\delta_{\gamma j}=\delta_{\beta j}$ for all $j$: $\gamma=\beta$. We have now proved that the only possible limiting values of $x_0(r)$ as $r\to\tau$ are $x_0(\tau)$, $+\infty$; (11.9) is true for $x(t)\in\Lambda$, and hence is true with probability 1. If the matrices are finite-dimensional, there can be only finite limiting values of $x_0(r)$; so $x(r)\to x(\tau)$ $(r\in R)$ with probability 1, as we have already proved in Theorem 10. Now suppose that $(p_{ij}(t))$ is infinite-dimensional, and suppose that for some $x_0(t)$ in $\Lambda$, $x_0(r)$ does not approach $x_0(\tau)=\beta$ $(r\in R)$. Then there must be a sequence of integers $\{i_\nu\}$ and a sequence $\{r_\nu\}$ such that $x_0(r_\nu)=i_\nu\to+\infty$, $(r_\nu\to\tau)$. Then (11.10) becomes



(11.12) lim pi,(T — = — 7) 


for all j and rational JT >r. It follows readily from (11.12) with 7 =6 that there 
is a subsequence {j,} = {i,,} of {i,} and a sequence of values {7,} of T, 
T, | r, such that 


(11.13) lim — = lim — 7) = 1. 


This evidently contradicts condition C(8). Thus if C(G) is satisfied, gs< ©. 
The family of chance variables 


— 7) 


has, as we have seen, the property E*. Since these chance variables are non- 
negative, there is convergence when r]|7, with probability 1(%). Since 


_x(r)—x(7) in measure, 


(11.14) ten 
tir Pazir)(r) Pap(T) 


almost everywhere where x(r) =8. We can suppose A has been chosen so that 
(11.14) is true if x(¢#)E A. Unless x(r) whenever x(¢) EA, r | r, and x(r) =8, 
there is an xo(t) in A with xo(r) =8, a sequence of integers {i,} , and a sequence 
{r,} such that xo(r,) =i, ++ ©, 7, | 7. Then, using (11.14), 


— 7) 1 
Pai,(1) pap(7) 


which contradicts C*(8, r). Thus if C*(8, 7) is satisfied, gg< ©. (We are using 
here the fact which is implicit in the discussion of ¢g(h) above that if x(r) =, 
lim,,,x«(r) =8 with probability 1 if and only if gs< @.) 

We shall need a somewhat stronger condition than C*(8, r) below. We 
shall say that condition C**(8, 7) is satisfied if p.(r) > 0 and if there are posi- 
tive numbers 7, 6 such that if 0<sigs2< 6, +51) >0, then 


(11.16) pai(s2) < (1 — 
bap(7) 


Condition C**(8,7) is always satisfied in the finite-dimensional case, under hy- 
pothesis H,, if BEG, since (11.16) becomes (11.6) when s; and sz approach 0. 


(11.15) 


() This is true if r | + along any sequence of values (Doob, these Transactions, vol. 47 
(1940), p. 460, Theorem 1.3) and this means the truth of the statement when r | r,r ER (Doob, 
these Transactions, vol, 42 (1937), p. 111, Theorem 1.3, or, in another formulation, Duke 
Mathematical Journal, vol. 4 (1938), pp. 758-759, Lemma 2). 



THEOREM 12. Suppose that hypothesis $H_\alpha$ is true. Let $R$ be any denumerable everywhere dense $t$-set. Then there is a set $\Lambda$ of functions $x(t)$, of probability 1, such that if $x(t)\in\Lambda$ the following statements are true.

(a) If the matrices are finite-dimensional, $x(r)$ $(r\in R)$ has only isolated jumps as discontinuities.

(b) Either $\lim_{r\downarrow\tau}|x(r)|=\infty$ $(r\in R)$, or there is an integer $\beta$, depending on the function $x(t)$ and on $\tau$, such that


(12.1)    $\lim_{r\downarrow\tau}\dfrac{x(r)-\beta}{1+|x(r)|^2}=0$    $(r\in R)$.


If condition $C(\beta)$ is satisfied, then either $\lim_{r\downarrow\tau}|x(r)|=\infty$ or there is an integer $\beta$ depending on the function $x(t)$ and on $\tau$, such that $\lim_{r\downarrow\tau}x(r)=\beta$ $(r\in R)$.

(b*) The statement of (b) remains true with $r\uparrow\tau$, instead of $r\downarrow\tau$, replacing condition $C(\beta)$ by $C^{**}(\beta,\tau)$.

(c) For each $\tau$, there will be an integer $\beta$ (and $\beta=x(\tau)$), as described in (b), (b*), with probability 1.


Theorem 12 is closely related to work of Doeblin and Feller, with which 
it will be compared below. 

The families $\{y_t\}$, $\{y^*_t\}$ have the properties $\mathcal{E}$, $\mathcal{E}^*$, respectively. There is therefore an $\omega$-set $\Lambda$ of probability 1, such that if $x(t)\in\Lambda$, the corresponding $y_t$, ($y^*_t$) coincide on $R$ with functions everywhere continuous on the right (left), for all $j$, rational $T$, $T^*$. (It has been proved above that there are no points of fixed discontinuity.) We can also suppose that $x(r)\ne j$ unless $p_{\alpha j}(r)>0$, if $x(t)\in\Lambda$. Then if $x_0(t)\in\Lambda$, if $T>\tau>T^*>0$, $T$, $T^*$ rational, the following limits exist:


(12.2)    $\lim_{r\downarrow\tau}p_{x_0(r)j}(T-r),$

(12.2*)    $\lim_{r\uparrow\tau}\dfrac{p_{jx_0(r)}(r-T^*)}{p_{\alpha x_0(r)}(r)}$    $(r\in R)$.


If $x_0(r)$ takes on a subscript $\beta$ for values of $r$ approaching $\tau$ from above, we can evaluate the limit in (12.2):


(12.3)    $\lim_{r\downarrow\tau}p_{x_0(r)j}(T-r)=\lim_{r\downarrow\tau}p_{\beta j}(T-r)=p_{\beta j}(T-\tau).$


Then $\beta$ is uniquely determined, for if $\gamma$ had the same defining property, we
should have $p_{\beta j}(T - \tau) = p_{\gamma j}(T - \tau)$ for all rational $T > \tau$, where $\beta$, $\gamma$ are both
in $G_\alpha$. When $T \downarrow \tau$ this means (using hypothesis $H_\alpha$) that $\delta_{\beta j} = \delta_{\gamma j}$ for all j,
impossible unless $\beta = \gamma$. Thus (12.1) is true. Then in the finite-dimensional
case, $\lim_{r \downarrow \tau} x_0(r) = \beta$, that is, $x_0(r) = \beta$ for r sufficiently near $\tau$ ($r > \tau$). In
the infinite-dimensional case, unless $\lim_{r \downarrow \tau} x_0(r) = \beta$, we have proved that
$\limsup_{r \downarrow \tau} |x_0(r)| = \infty$, and the method of proof of Theorem 11 can be carried
through to find a contradiction to condition C($\beta$). We have now proved
that Theorem 12(b) is true, supposing however that there is a subscript $\beta$
as described. (In the finite-dimensional case there is always such a $\beta$.) In the
infinite-dimensional case, if there is no such $\beta$, $\lim_{r \downarrow \tau} |x(r)| = \infty$. This finishes
the proof of (b). The discussion when $r \uparrow \tau$ is carried on in the same way, using
the existence of the limit in (12.2*). Theorem 12(a) is now obviously true.
For a given $\tau$, $x(r) \to x(\tau)$ in measure, when $r \to \tau$, so there will be an integer $\beta$
as described above, and $\beta = x(\tau)$, with probability 1.

As usual in this sort of discussion, instead of saying that x(r) ($r \in R$) has
the above described properties with probability 1, we could say that if a
space $\Omega$ of a stochastic process is chosen properly, all the x(t) in $\Omega$ will have
the above properties, where t ranges through all values.

Doeblin has considered a general Markoff process in which the transition
probability of going from state i at time t to state j at time t' is not supposed
necessarily to be a function of t' - t, and in which it is not supposed that the
number of possible states is denumerably infinite. His hypotheses, when
translated into our notation, and simplified because of the more special process
being considered here, become


(12.4)  $\lim_{t \to 0} p_{ii}(t) = 1$


uniformly in i. This hypothesis, combined with the hypothesis that the process
is initially in state $\alpha$, is considerably stronger than hypothesis $H_\alpha$ (except
in the finite-dimensional case, when, assuming hypothesis $H_\alpha$, Doeblin's condition
is always applicable) and evidently also implies condition C($\beta$) for all
$\beta \in G_\alpha$. Doeblin showed that under his hypotheses, and assuming some given
initial state, neglecting an $\omega$-set of measure 0, x(r) ($r \in R$) has only isolated
jumps as discontinuities(27).

Conversely, suppose that the process is initially in state $\alpha$, and that the
$\omega$-measure has the property that x(r) ($r \in R$, a denumerable everywhere dense
t-set) has only isolated jumps as discontinuities, with probability 1. Theorem
11 shows that in this case there must be continuity at each fixed 7, with proba- 
bility 1, that is, © if 1€©G,. Also, by Theorem 10, Let P(t) 
be the probability(?*) that if x(r)=i then x(r+#)=j and x(r) has » jumps 
in going from i to j, between r and r++. It is easily verified that if i1GG. 


(2) = 
Pi (t) = =f PS (s)q ine (n = 0), 
7 


(27) Skandinavisk Aktuarietidskrift, vol. 22 (1939), pp. 211-222.
(28) For this conditional probability to have a meaning we must suppose that $p_{\alpha i}(\tau) > 0$;
$\tau$ can always be so chosen, if $i \in G_\alpha$.




and obviously 


(12.6)  $p_{ij}(t) = \sum_{n=0}^{\infty} P_{ij}^{(n)}(t).$

Moreover if we suppose only hypothesis $H_\alpha$ to be true, and that $\sum_j q_{ij} = q_i < \infty$
($i \in G_\alpha$), then if $i \in G_\alpha$, considerations analogous to those used in the proof of
Theorem 10 show that $P_{ij}^{(n)}(t)$ as defined in (12.5) will have the probability
meaning described above. On the other hand, (12.6) is now true if and only
if the only discontinuities of x(r) are isolated jumps, with probability 1. Feller
has found necessary and sufficient conditions on the $q_i$, $q_{ij}$ that (12.6) be
true(29). The above remarks give a complete justification for Feller's probability
interpretation (ibid., p. 498) of the $P_{ij}^{(n)}(t)$. (He did not need this interpretation
in his proofs.) The details given in Theorem 12 on the character of
x(t) at a discontinuity which is not a jump round out Feller's description,
arising from an entirely different background (ibid., pp. 512-513).

Suppose again that hypothesis $H_\alpha$ is satisfied and that $\sum_j q_{ij} = q_i < \infty$ for
all $i \in G_\alpha$. The differential equations


(13.1)  $p_{ik}'(t) = -q_i p_{ik}(t) + \sum_{j \ne i} q_{ij} p_{jk}(t)$  ($i \in G_\alpha$)


satisfied by the $p_{ij}(t)$, due to Kolmogoroff(30), are well known, but their intimate
relation to the continuity properties of the x(t) seems less well known.
In fact, under the above hypotheses, we have shown (Theorem 10) that (with
probability 1) if $x(\tau) = i \in G_\alpha$ there is a first discontinuity of x(r) ($r \in R$, an
everywhere dense denumerable t-set) after $\tau$, an isolated jump. Then the probability
of going from i to k is the sum of the probabilities of going from i to j
on the first jump, and then to k (summed over j). Considerations analogous
to those used in the proof of Theorem 10 now show that, evaluating the above
probabilities,


(13.2)  $p_{ik}(t) = \sum_{j \ne i} \int_0^t e^{-q_i s} q_{ij} p_{jk}(t - s)\,ds + \delta_{ik} e^{-q_i t}.$

The equations of (13.1) are obtained by differentiating those of (13.2), in 
which the series can obviously be differentiated term by term. The second 
set of differential equations obtained by Kolmogoroff 


(13.3)  $p_{ik}'(t) = -q_k p_{ik}(t) + \sum_{j \ne k} p_{ij}(t) q_{jk}$  ($i \in G_\alpha$)

(29) These Transactions, vol. 48 (1940), pp. 506-507. Feller's results are applicable to considerably
more general stochastic processes than those considered here.
(30) Mathematische Annalen, vol. 104 (1931), p. 429. Kolmogoroff imposed further restrictions
on the $p_{ij}(t)$. Feller (op. cit., p. 495) obtained (13.1) with substantially our hypotheses
given above.




does not seem always to be true without further hypotheses. We shall show
that the truth of (13.3) is equivalent to the imposition of certain regularity
properties on the x(t). The probability


Pir(te) — (i # k, te > th) 


is at least equal to the probability that if x(r—#) =7, then x(r) goes to j at 
some point between r—#, and 7, when x(r) jumps to k, remaining at & until 
summed over Thus 


ty 
Moreover there is equality if and only if when $x(\tau) = i$, there is a last discontinuity
of x(r) before $\tau$, which is a jump, with probability 1. Dividing (13.4)
by $t_2 - t_1$ and letting $t_2 - t_1 \to 0$ we obtain


(13.5)  $p_{ik}'(t) \ge -q_k p_{ik}(t) + \sum_{j \ne k} p_{ij}(t) q_{jk}$  ($i \in G_\alpha$).


It is easily verified that (13.5) also follows directly from (0.1). Since there 
is equality in (13.5) if and only if there is equality in (13.4), we have obtained 
the following theorem. 


THEOREM 13. Suppose that hypothesis $H_\alpha$ is satisfied and that $\sum_j q_{ij} = q_i < \infty$
for all $i \in G_\alpha$. Then (13.1) is always true; (13.3) is true (for all t) if and only if
when $x(\tau) = i$, there is, with probability 1, a last discontinuity of x(r) before $\tau$,
which is an isolated discontinuity (a jump).
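
For a chain with finitely many states, where every discontinuity is an isolated jump, both systems of differential equations hold, and this can be checked numerically. The following is only a minimal sketch under stated assumptions and is not part of the argument above: the matrix Q is a hypothetical 3-state example with Q[i][j] = q_ij for j different from i and Q[i][i] = -q_i, the helper names are illustrative, and P(t) is approximated by a truncated exponential series.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transition_matrix(Q, t, terms=60):
    """Approximate P(t) = exp(tQ) by a truncated power series."""
    n = len(Q)
    P = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in P]
    tQ = [[t * q for q in row] for row in Q]
    for m in range(1, terms):
        term = [[v / m for v in row] for row in mat_mul(term, tQ)]
        P = [[P[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return P


def max_abs_diff(A, B):
    """Largest entrywise difference between two matrices."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))


# Hypothetical generator (an assumption for illustration only).
Q = [[-2.0, 1.5, 0.5],
     [1.0, -1.0, 0.0],
     [0.3, 0.7, -1.0]]

t, h = 0.8, 1e-5
P = transition_matrix(Q, t)
P_plus = transition_matrix(Q, t + h)
P_minus = transition_matrix(Q, t - h)
# Central difference approximation of the derivative P'(t).
dP = [[(P_plus[i][j] - P_minus[i][j]) / (2 * h) for j in range(3)]
      for i in range(3)]

backward_residual = max_abs_diff(dP, mat_mul(Q, P))  # first system: P' = QP
forward_residual = max_abs_diff(dP, mat_mul(P, Q))   # second system: P' = PQ
row_sums = [sum(row) for row in P]
```

Both residuals should be small and each row of P should sum to 1. The interest of Theorem 13 lies in the denumerable case, which a finite example of this kind cannot exhibit.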


It is interesting to note that if $\bar{p}_{ij}(t)$ is the probability that if $x(\tau) = i$ then
$x(\tau + t) = j$ and the transition from i to j(31) is accomplished in a finite number
of isolated jumps, then $\bar{p}_{ij}(t)$ evidently satisfies (0.1) except that $\sum_j \bar{p}_{ij}(t)$ may
be less than 1. Moreover (13.1) is also true for the $\bar{p}_{ik}(t)$ since the derivation
for the $p_{ik}(t)$ applies equally well to $\bar{p}_{ik}(t)$. And the derivation we have given
of (13.5), when applied to the $\bar{p}_{ij}(t)$, actually gives equality: (13.3) is true of
the $\bar{p}_{ij}(t)$ in all cases. The latter fact was also proved by Feller.


(31) Strictly speaking, we should restrict i, j to lie in $G_\alpha$.


UNIVERSITY OF ILLINOIS, 
URBANA, ILL.




ON CONVERSE GAP THEOREMS 


BY 
GEORGE POLYA 


1. In what follows, I consider power series with preassigned vanishing coefficients.
I write such a series in the form


(1)  $a_1 z^{\lambda_1} + a_2 z^{\lambda_2} + \cdots + a_n z^{\lambda_n} + \cdots.$

The numbers $a_n$ are different from 0 and the $\lambda_n$ are integers,

(2)  $0 < \lambda_1 < \lambda_2 < \cdots < \lambda_n < \cdots.$

I assume that the radius of convergence of the series (1) is finite and different
from 0.
I quote two well known theorems(1).


THEOREM I. If

(3)  $\lim_{n \to \infty} n \lambda_n^{-1} = 0$

the domain of existence of the analytic function defined by (1) is the interior of
the circle of convergence of the series (1).


THEOREM II. If

(4)  $\liminf_{n \to \infty} n \lambda_n^{-1} = 0$

the domain of existence of the analytic function defined by (1) is a simply connected
part of the z-plane (from which it follows that the function defined by (1)
is uniform).


Is it possible to improve these theorems by enlarging the hypothesis? 
Is there a less exacting hypothesis leading to the same conclusion? I say no. 
In fact I shall solve the following problems(?). 


Presented to the Society, February 22, 1941; received by the editors June 4, 1941. 

(1) In the following the two parts of my paper Untersuchungen über Lücken und Singularitäten
von Potenzreihen, Mathematische Zeitschrift, vol. 29 (1929), pp. 549-640 and Annals
of Mathematics, (2), vol. 34 (1933), pp. 731-777 will be quoted as LS I and LS II. For Theorem
I (Fabry's theorem) see LS I, p. 627, Theorem VIa; for Theorem II (theorem of the present
author) see LS II, p. 737, Theorem B.

(2) These problems have been stated, with an indication of the proof: I in Comptes Rendus
de l'Académie des Sciences, Paris, vol. 208 (1939), pp. 709-711; II in the Bulletin of the American
Mathematical Society, vol. 47 (1941), p. 207. A problem related to I was stated by G. Szegő,
Acta Litterarum ac Scientiarum Szeged, vol. 1 (1923), p. 73.




PROBLEM I. Given an infinite sequence of integers $\lambda_1, \lambda_2, \cdots, \lambda_n, \cdots$ satisfying
(2) but not satisfying (3), find a power series of the form (1) defining an
analytic function whose domain of existence extends beyond the circle of convergence.


PROBLEM II. Given an infinite sequence of integers $\lambda_1, \lambda_2, \cdots, \lambda_n, \cdots$ satisfying
(2) but not satisfying (4), find a power series of the form (1) defining a
multiform analytic function (whose domain of existence, consequently, cannot be
a simply connected part of the z-plane).


2. It is of some interest to restate the facts which we proposed to prove
in a different terminology. This terminology is due to Borel but, so far as I
know, it was used by him just once(3) and it has never been used since.

Let us consider power series whose radius of convergence is different from
0 and $\infty$ and let us say that two such series


$\sum_{n=0}^{\infty} c_n z^n, \qquad \sum_{n=0}^{\infty} c_n' z^n$

belong to the same class, if they have the same distribution of nonvanishing
coefficients, that is if, for n = 0, 1, 2, $\cdots$, $c_n$ and $c_n'$ are either both 0 or both
different from 0. In fact, a class of power series is characterized by those
powers of z whose coefficients do not vanish, and therefore by an increasing
sequence of integers $\lambda_1, \lambda_2, \cdots$ and all series belonging to the class have the
same form (1)(4). Let us call


$\lim_{n \to \infty} n \lambda_n^{-1}, \qquad \liminf_{n \to \infty} n \lambda_n^{-1}, \qquad \limsup_{n \to \infty} n \lambda_n^{-1}$


(in the given order) the density, the lower density and the upper density of 
the class (the first may not exist, but the second and the third necessarily 
exist). With this terminology, we may compound Theorem I and Problem I 
into one short statement (and the same for Theorem II and Problem II): 


I. In order that all power series of a class have their circle of convergence as 
natural boundary it is necessary and sufficient that the density of the class be 0. 


II. In order that all power series of a class define uniform analytic functions
it is necessary and sufficient that the lower density of the class be 0.
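
The quantities entering statements I and II are the ratios of n to the n-th exponent of the class. As a purely illustrative sketch (the two sample sequences and the helper name are assumptions, not taken from the text), the ratios can be tabulated for a finite prefix of a sequence; a finite prefix can of course only suggest the limiting behaviour, not establish it.

```python
def ratios(lam):
    """Return n / lambda_n for n = 1, 2, ..., len(lam)."""
    return [(index + 1) / value for index, value in enumerate(lam)]


N = 2000
squares = [n * n for n in range(1, N + 1)]   # exponents 1, 4, 9, ...: ratios tend to 0
evens = [2 * n for n in range(1, N + 1)]     # exponents 2, 4, 6, ...: ratios equal 1/2

# Look only at the second half of each prefix, where n is already large.
tail_squares = ratios(squares)[N // 2:]
tail_evens = ratios(evens)[N // 2:]
largest_square_ratio = max(tail_squares)
```

On these samples the ratios for the squares decrease like 1/n, so that class has density 0, while the class of even exponents has density 1/2 and therefore falls outside the hypotheses of Theorems I and II.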


A few other well known facts on the singularities of power series may be
quite elegantly stated in the same terminology(5).


(3) Comptes Rendus de l'Académie des Sciences, Paris, vol. 137 (1903), pp. 695-697.

(4) Series (1) corresponds to the (evidently unessential) assumption that the coefficient
of $z^0$ is zero in all series of the class.

(5) See also, for further literature, LS I, p. 622, Theorem IIIa for the first, and LS II, p. 745,
Theorem IV for the third statement. The second statement is easy.




Each class contains non-continuable power series. 

In order that a class contain a power series having on its circle of convergence
a pole and no other singular point, it is necessary and sufficient that the sequence
$\lambda_1, \lambda_2, \cdots$ contain nearly all integers, that is, either all integers or all with a
finite number of exceptions.

In order that a class contain a power series having on its circle of convergence 
an essential singular point (an isolated singularity which is not a pole and not 
a branch point) and no other singular point, it is necessary and sufficient that 
the density of the class exist and be 1. 


These theorems seem to suggest that there are a few more of the same 
kind. 

3. Our solution of the Problems I and II uses some properties of the series

(5)  $F(1)z + F(2)z^2 + \cdots + F(n)z^n + \cdots = \Phi(z)$

where F(z) denotes an entire function of exponential type(6). The real-valued
periodic function $h(\varphi)$ of the real variable $\varphi$, defined by


(6)  $h(\varphi) = \limsup_{r \to \infty} r^{-1} \log |F(re^{i\varphi})|$


is called the indicator of F(z). The series (5) has a finite radius of convergence
and its analytic continuation is closely connected with the indicator $h(\varphi)$. I
quote the following fact(7):


If

(7)  $h(-\pi/2) < \pi, \qquad h(\pi/2) < \pi$


then the function $\Phi(z)$ defined by the series (5) is regular along the negative real
axis, and also at the point $z = \infty$ in whose neighborhood it has the development


(8)  $\Phi(z) = -F(0) - F(-1)z^{-1} - F(-2)z^{-2} - \cdots.$


This theorem yields a quick solution of Problem II. In fact, assume that
the sequence of integers $\lambda_1, \lambda_2, \cdots$ does not satisfy (4). Then there exists a
positive a, such that


(9)  $\dfrac{n}{\lambda_n} > a.$

Define

(10)  $G(z) = \prod_{n=1}^{\infty} \left(1 - \dfrac{z^2}{\lambda_n^2}\right).$


(6) Defined in LS I, p. 578.
(7) See LS I, pp. 604-609, especially formulae (72) and (73), p. 609. Observe that the $\Phi(z)$
of these formulae differs by an additive constant from the $\Phi(z)$ of the present formula (5).




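It follows from (9), (10) that, for r > 0,


(11)  $G(ir) = \prod_{n=1}^{\infty} \left(1 + \dfrac{r^2}{\lambda_n^2}\right) > \prod_{n=1}^{\infty} \left(1 + \dfrac{a^2 r^2}{n^2}\right) = \dfrac{\sin i\pi a r}{i\pi a r} = \dfrac{e^{\pi a r} - e^{-\pi a r}}{2\pi a r}.$


Define F(z) by the equation

(12)  $\pi z F(z) G(z) = \sin \pi z.$


F(z) is evidently an entire function of exponential type(8). For positive integral m

(13)  $F(m) = (-1)^{\lambda_n} / (\lambda_n G'(\lambda_n))$ if $m = \lambda_n$,  $F(m) = 0$ if $m \ne \lambda_1, \lambda_2, \cdots$,

(14)  $F(0) = 1,$

and, by virtue of (11) and (12),


(15)  $\limsup_{r \to \infty} r^{-1} \log |F(\pm ir)| \le \pi - \pi a.$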
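
The closed form used for the comparison product, namely that the product of the factors 1 + a²r²/n² over n = 1, 2, ... equals sinh(πar)/(πar), is the classical product expansion of the sine evaluated at an imaginary argument. The following minimal sketch checks it numerically; the sample values of a and r and the truncation length are arbitrary assumptions chosen only for illustration.

```python
import math


def partial_product(x, terms):
    """Product of (1 + x^2 / n^2) for n = 1, ..., terms."""
    prod = 1.0
    for n in range(1, terms + 1):
        prod *= 1.0 + (x * x) / (n * n)
    return prod


a, r = 0.4, 3.0                      # sample values with 0 < a < 1 and r > 0
x = a * r
closed_form = math.sinh(math.pi * x) / (math.pi * x)
approx = partial_product(x, 200000)  # truncated infinite product
rel_error = abs(approx - closed_form) / closed_form
```

The truncated product approaches the closed form from below, with a relative error of order x²/N after N factors, so the agreement improves slowly as more factors are taken.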


Consider the power series, arising from (5),


(16)  $\int_0^z \Phi(z) z^{-1}\,dz = \sum_{n=1}^{\infty} \dfrac{F(n)}{n} z^n.$


Series (16) is evidently of form (1). The condition (7) is fulfilled, see (6)
and (15). Therefore, the analytic continuation of (16) along the negative real
axis is possible. But, in a certain neighborhood of the point $z = \infty$, we have,
by virtue of (8) and (14), with a certain constant C,


$\int_0^z \Phi(z) z^{-1}\,dz = C - \log z + \dfrac{F(-1)}{z} + \dfrac{F(-2)}{2z^2} + \cdots$


and therefore the analytic continuation of (16) is not a single-valued function.
Thus (16) fulfills all the requirements of Problem II.

4. Now we are going to solve Problem I. We use again the series (5) and
the connection between the analytic continuation of this series and the indicator
$h(\varphi)$. We need now the following facts(9).


To each entire function F(z) of exponential type corresponds a bounded and
closed convex domain $\mathfrak{I}$, called the indicator diagram of F(z). The domain $\mathfrak{I}$ lies
in the half-plane


$x \cos \varphi + y \sin \varphi - h(\varphi) \le 0$


(8) In fact, (24) holds also for the present F(z).

(9) See LS I, pp. 604-609, especially p. 606. (Correct the misprint at the end of the fourteenth
line from the top of p. 606; read h(0) instead of 0.) I write as usual z = x + iy with real x, y
in the following statement and later in (26), and (27).




but has a point on the boundary of this half-plane; this holds for all real $\varphi$. If the
circle of convergence of the power series (5) is a natural boundary, the boundary
of the indicator diagram $\mathfrak{I}$ of F(z) contains a segment of a vertical straight line,
of length not less than $2\pi$, limiting $\mathfrak{I}$ from the right.


We shall use this fact at the end of the following construction which we 
divide in successive steps. 

i. We start from a sequence $\lambda_1, \lambda_2, \cdots$ that does not satisfy (3). It follows
that, for a fixed $\alpha$, $0 < \alpha < 1$, there are arbitrarily distant intervals of the form
$r\alpha < x < r$ which contain more than $r\delta$ points of the sequence $\lambda_1, \lambda_2, \cdots$, $\delta$
being positive, sufficiently small, but fixed(10). Thus we can choose a sequence
of increasing positive integers $n_1, n_2, \cdots$ and a positive $\delta$, satisfying the following
condition:

(I) The interval between $(n_k - 1/2)/2^{1/2}$ and $n_k - 1/2$, which I denote by
$I_k$, contains at least $n_k \delta$ points of the sequence $\lambda_1, \lambda_2, \cdots$.

By rejecting if necessary certain elements of the chosen sequence, we can
find a sequence (a subsequence of the first chosen sequence, which, by an appropriate
change of notation, will be called again $n_1, n_2, \cdots$) satisfying not
only condition (I) but also the following:


< (me — $)/2*/?, 


In fact, $n_{k-1}$ being fixed, both inequalities (II) are satisfied by any sufficiently
great $n_k$.
Call those elements of the sequence $\lambda_1, \lambda_2, \cdots$ which are contained in the
intervals $I_1, I_2, \cdots$, numbered in order of magnitude, $\mu_1, \mu_2, \cdots$.

ii. Using the sequence $\mu_1, \mu_2, \cdots$ we just constructed, define

(17)  $G(z) = \prod_{n=1}^{\infty} \left(1 - \dfrac{z^2}{\mu_n^2}\right).$


This definition is different from (10), which we used in solving Problem II,
and which we disregard now. We shall estimate


(me log | 3) | = > (me 3)-1 log | *) 1| 
Bj 
=5,+52.+ Ss. 


S: contains those terms of the sum dn the right-hand side of the first line 
whose yu; is contained in J,; S: those terms whose y; is contained in one of 


(18) 


(10) The easy proof can be left to the reader; it is contained in the fuller developments of
LS I, pp. 556-560, especially in (14) p. 559.


log 




the intervals J;, I2,---, Ix-1; Ss those whose y; is in one of the intervals 
Tus1, Inga, + + + . These intervals do not overlap, by the first inequality (II). If 
bu; is in S; then uj;<,-1—1/2 and therefore, by the second inequality (II), 


(19) Si < (m — 0 


as n,— ©. If w;is in S; then u;>n”,—1/2 and therefore each term in S; is 
negative, 


(20) 5S; < 0. 


The terms in S: are also negative, and in number not less than 1,5, by 
condition (I); the single term decreases algebraically (increases in absolute 
value) as yu; increases. We obtain the (algebraically) greatest value of S: by 
taking as few terms as possible and terms as close to the left-hand end point of 
I, as possible. If the summation is extended to the integers / satisfying. 


— <1 < (me — + 


we have 


Ss < — | 


(1/202)-+8 1 
log (— - 1) 
By (18), (19), (20), (21) we obtain, n running through the positive integers,
that

(22)  $\liminf_{n \to \infty} (n - \tfrac{1}{2})^{-1} \log |G(n - \tfrac{1}{2})| < 0.$


(21) 


iii. Define F(z) by (12). From (12) and (22) it follows that

(23)  $h(0) = \limsup_{r \to \infty} r^{-1} \log |F(r)| \ge \limsup_{n \to \infty} (n - \tfrac{1}{2})^{-1} \log |F(n - \tfrac{1}{2})| > 0.$


On the other hand, if |z| = r,

(24)  $|F(z)| \le \prod_{n=1}^{\infty} \left(1 + \dfrac{r^2}{n^2}\right)$

and therefore, by (6), for all real $\varphi$

(25)  $h(\varphi) \le \pi.$


Inequalities for $h(\varphi)$ are equivalent to geometric conditions for $\mathfrak{I}$, the indicator
diagram of F(z). In the present case, F(z) is an even function and real-valued
for real z; so $\mathfrak{I}$ is symmetrical with respect to the real and to the imaginary
axes. By (25), $\mathfrak{I}$ is contained in the circle

(26)  $x^2 + y^2 = \pi^2.$

By (23), it has a common point with the vertical line

(27)  $x = h(0)$


and no point with an abscissa greater than h(0). But the segment intercepted
on the line (27) by the circle (26) is shorter than $2\pi$, because, by (23), h(0) > 0;
there is no segment of a vertical line, of length greater than or equal to $2\pi$, on the
boundary of the indicator diagram $\mathfrak{I}$.

iv. This last result shows that the series (5)


(28)  $\sum_{n=1}^{\infty} F(n) z^n = \sum_{n=1}^{\infty} \dfrac{(-1)^{\mu_n} z^{\mu_n}}{\mu_n G'(\mu_n)}$

is continuable, by virtue of the fact quoted at the beginning of this section.
Series (28) is not exactly of the form (1), because some of the coefficients $a_n$
may be zero, if $\mu_1, \mu_2, \cdots$ is a proper subsequence of $\lambda_1, \lambda_2, \cdots$ and not identical
with the latter. If the coefficient of $z^{\lambda_n}$ in (28) is 0, add the term


$\dfrac{z^{\lambda_n}}{\lambda_n!}$


to (28). The series obtained in this way is exactly of the form (1) and satisfies
all requirements of Problem I.


BROWN UNIVERSITY,
PROVIDENCE, R. I. 




NÖRLUND SUMMABILITY OF DOUBLE FOURIER SERIES


BY 
JOHN G. HERRIOT 


1. Introduction. Throughout this paper the function f(t, u) is assumed
to be Lebesgue integrable over the square Q ($-\pi$, $\pi$; $-\pi$, $\pi$) and to have
period $2\pi$ in each variable. The double Fourier series of f is denoted by $\sigma(f)$
and the rectangular partial sums of $\sigma(f)$ are denoted by $s_{mn}(x, y; f)$. To say
that a method of summability S possesses the localization property means
that if f vanishes in a neighborhood of (x, y) then S sums $\sigma(f)$ at (x, y) to 0.
It is well known that the Cesàro method (C, 1, 1), for example, does not
possess the localization property. G. Grünwald [2](1) has shown that at any
point (x, y) of continuity of f the square partial sums $s_{nn}(x, y; f)$ are summable
(C, 1) to f(x, y). Thus (C, 1) applied to the square partial sums possesses
the localization property. We show in §5 that this is the best possible
result.

In this paper we shall apply Nörlund means to $\sigma(f)$. To define the Nörlund
mean of $\{s_{nn}(x, y; f)\}$ let $\{p_n\}$ be any sequence of constants. Let $P_n = \sum_{k=0}^{n} p_k \ne 0$.
The Nörlund mean is


(1.01)  $t_n(x, y; f) = \dfrac{1}{P_n} \sum_{k=0}^{n} p_{n-k}\, s_{kk}(x, y; f).$


If $t_n(x, y; f)$ tends to a limit as $n \to \infty$ the sequence $\{s_{nn}(x, y; f)\}$ is said to be
summable $N_p$ to this limit. We shall consider only regular Nörlund methods
of summability. The conditions of regularity for $N_p$ are(2)


(1.02)  $\sum_{k=0}^{n} |p_k| = O(|P_n|), \qquad p_n/P_n \to 0, \quad \text{as } n \to \infty.$


Cesàro (C, $\alpha$), $\alpha > 0$, is clearly a regular Nörlund method.
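
The definition of the Nörlund mean of a single sequence can be made concrete with a short computation. The sketch below is illustrative only: the function name and the sample sequence (the partial sums of 1 - 1 + 1 - ...) are assumptions, and the code treats an arbitrary sequence rather than the square partial sums of a double Fourier series. It confirms that the choice of all weights equal to 1 reproduces the (C, 1) means.

```python
def norlund_means(p, s):
    """Return t_0, ..., t_N with t_n = (1/P_n) * sum_{k=0}^{n} p[n-k] * s[k]."""
    means = []
    P = 0.0
    for n in range(len(s)):
        P += p[n]                      # P_n = p_0 + ... + p_n
        if P == 0:
            raise ValueError("P_n must be different from 0")
        means.append(sum(p[n - k] * s[k] for k in range(n + 1)) / P)
    return means


# Illustrative input: partial sums 1, 0, 1, 0, ... of the series 1 - 1 + 1 - ...
s = [1.0 if k % 2 == 0 else 0.0 for k in range(1001)]
c1 = norlund_means([1.0] * len(s), s)             # weights 1 give the (C, 1) means
averages = [sum(s[:n + 1]) / (n + 1) for n in range(len(s))]
gap = max(abs(a - b) for a, b in zip(c1, averages))
```

With these weights the means approach 1/2, the (C, 1) sum of the divergent series, which is the behaviour expected of a regular method applied to this sequence.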

We shall also consider a double Nörlund transform of $\{s_{mn}(x, y; f)\}$. Let
$\{p_n^{(k)}\}$ (k = 1, 2) be two sequences of constants. Let $P_n^{(k)} = \sum_{j=0}^{n} p_j^{(k)} \ne 0$. Then
the double Nörlund transform is


(1.03)  $t_{mn}(x, y; f) = \dfrac{1}{P_m^{(1)} P_n^{(2)}} \sum_{i=0}^{m} \sum_{j=0}^{n} p_{m-i}^{(1)} p_{n-j}^{(2)}\, s_{ij}(x, y; f).$

We shall restrict the manner in which $m, n \to \infty$. If, for any $\lambda \ge 1$, $t_{mn}(x, y; f)$
tends to a limit when $m, n \to \infty$ in such a manner that $m/n \le \lambda$, $n/m \le \lambda$, this
limit being independent of $\lambda$, then $\sigma(f)$ is said to be restrictedly summable $N_p$
at (x, y) to this limit. (C, $\alpha$, $\beta$) is clearly a double Nörlund method.


Presented to the Society, September 11, 1940; received by the editors May 26, 1941.
(1) The numbers in square brackets refer to the bibliography at the end of the paper.
(2) See, for example, Hille and Tamarkin [4, p. 758].

In §§5 and 6 of this paper local conditions are imposed on the function
whose double Fourier series is under consideration in order to discover which
of these methods of summability possess the localization property and which
do not. In §§7 to 11 methods of summability which sum $\sigma(f)$ almost everywhere
to f are studied. Theorem 5 is a generalization of and includes the result
of Marcinkiewicz and Zygmund [6]. When the present paper had been
prepared for publication the author received a copy of a paper just published
by Grünwald [3] in which it was shown that the sequence $\{s_{nn}(x, y; f)\}$ is
summable (C, 1) almost everywhere to f(x, y). However, by Corollary 6.1 of
the present paper, this result is true also for (C, $\alpha$), $\alpha > 0$. Both Corollary 6.1
and Theorem 6 from which it follows were established several months before
the appearance of Grünwald's paper. Indeed the result of Corollary 6.1 was
known much earlier, for, on reading the proofs of a paper of Marcinkiewicz
[5] in which it was shown that the sequence $\{s_{nn}(x, y; f)\}$ is summable (C, 2)
almost everywhere to f(x, y), Zygmund pointed out that the result could be
extended to (C, $\alpha$), $\alpha > 0$. But Marcinkiewicz did not wish to change his paper
and so the result was not published.

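2. Basic formulas. The following notation will be employed throughout
this paper. Let


(2.01)  $\varphi_{xy}(t, u) = f(x + t, y + u) + f(x - t, y + u) + f(x + t, y - u) + f(x - t, y - u) - 4f(x, y).$


It is well known that


(2.02)  $s_{mn}(x, y; f) = \dfrac{1}{\pi^2} \iint_Q f(x + t, y + u) D_m(t) D_n(u)\,dt\,du$


where $D_n(t)$ denotes the Dirichlet kernel. Then


(2.03)  $t_n(x, y; f) = \iint_Q f(x + t, y + u) K_n(t, u)\,dt\,du$


where


(2.04)  $K_n(t, u) = \dfrac{1}{\pi^2 P_n} \sum_{k=0}^{n} p_{n-k} \dfrac{\sin (k + \tfrac{1}{2})t \, \sin (k + \tfrac{1}{2})u}{4 \sin \tfrac{1}{2}t \, \sin \tfrac{1}{2}u}.$


Clearly $K_n(t, u)$ is an even-even function of t and u and


$\iint_Q K_n(t, u)\,dt\,du = 1.$


It follows that

(2.05)  $t_n(x, y; f) - f(x, y) = \int_0^{\pi} \int_0^{\pi} \varphi_{xy}(t, u) K_n(t, u)\,dt\,du.$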

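In order to obtain alternative forms for $K_n(t, u)$ we set


(2.06)  $\mathfrak{P}_n(t) = \sum_{k=0}^{n} p_k e^{ikt} = \sum_{k=0}^{n} p_k \cos kt + i \sum_{k=0}^{n} p_k \sin kt = \mathfrak{C}_n(t) + i\,\mathfrak{S}_n(t).$


Now $\sin (k + \tfrac{1}{2})t \, \sin (k + \tfrac{1}{2})u = -\tfrac{1}{2}[\cos (k + \tfrac{1}{2})(t + u) - \cos (k + \tfrac{1}{2})(t - u)]$ and


$\sum_{k=0}^{n} p_{n-k} \cos (k + \tfrac{1}{2})(t + u) = \sum_{k=0}^{n} p_k \cos (n - k + \tfrac{1}{2})(t + u)$

$\qquad = \mathfrak{C}_n(t + u) \cos (n + \tfrac{1}{2})(t + u) + \mathfrak{S}_n(t + u) \sin (n + \tfrac{1}{2})(t + u).$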
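
The rearrangement just used rests on the addition formula for the cosine: with s written for t + u, the sum of p_{n-k} cos (k + 1/2)s over k equals the cosine sum of the p_k times cos (n + 1/2)s plus the sine sum of the p_k times sin (n + 1/2)s. A brief numerical check may be helpful; it is a sketch only, and the coefficient values and function names below are arbitrary assumptions.

```python
import math


def reversed_cosine_sum(p, s):
    """Sum over k of p[n-k] * cos((k + 1/2) s), with n = len(p) - 1."""
    n = len(p) - 1
    return sum(p[n - k] * math.cos((k + 0.5) * s) for k in range(n + 1))


def split_form(p, s):
    """C_n(s) cos((n + 1/2) s) + S_n(s) sin((n + 1/2) s)."""
    n = len(p) - 1
    cosine_sum = sum(p[k] * math.cos(k * s) for k in range(n + 1))
    sine_sum = sum(p[k] * math.sin(k * s) for k in range(n + 1))
    return (cosine_sum * math.cos((n + 0.5) * s)
            + sine_sum * math.sin((n + 0.5) * s))


p = [3.0, -1.0, 0.5, 2.0, 0.25]          # arbitrary sample coefficients p_0, ..., p_4
sample_points = (0.1, 0.7, 2.5)
gap = max(abs(reversed_cosine_sum(p, s) - split_form(p, s)) for s in sample_points)
```

The two expressions agree to rounding error at every sample point, as the identity requires for any choice of coefficients.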
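Substituting in (2.04) we have


(2.07)  $K_n(t, u) = -(8\pi^2 P_n \sin \tfrac{1}{2}t \, \sin \tfrac{1}{2}u)^{-1} \{\mathfrak{C}_n(t + u) \cos (n + \tfrac{1}{2})(t + u) + \mathfrak{S}_n(t + u) \sin (n + \tfrac{1}{2})(t + u) - \mathfrak{C}_n(t - u) \cos (n + \tfrac{1}{2})(t - u) - \mathfrak{S}_n(t - u) \sin (n + \tfrac{1}{2})(t - u)\}.$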
If we apply the mean value theorem to this we obtain 
K,(t, u) = — u(4x*P, sin sin 4u)-*{ — ©,(E:)-(m + 4) sin (n + 
(2.08) + (&:) cos (m + + (m + 4) cos (m + 
+ Sy (és) sin (m + 
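Forming the double Nörlund transform of $s_{mn}(x, y; f)$ we have

(2.09)  $t_{mn}(x, y; f) = \iint_Q f(x + t, y + u) N_m^{(1)}(t) N_n^{(2)}(u)\,dt\,du$

where

(2.10)  $N_n^{(k)}(t) = \dfrac{1}{\pi P_n^{(k)}} \sum_{j=0}^{n} p_{n-j}^{(k)} D_j(t) = \dfrac{1}{\pi P_n^{(k)}} \sum_{j=0}^{n} p_{n-j}^{(k)} \dfrac{\sin (j + \tfrac{1}{2})t}{2 \sin \tfrac{1}{2}t}$,  k = 1, 2.


Thus $N_n^{(k)}(t)$ is an even function of t and


$\int_{-\pi}^{\pi} N_n^{(k)}(t)\,dt = 1.$

We easily deduce that

(2.11)  $t_{mn}(x, y; f) - f(x, y) = \int_0^{\pi} \int_0^{\pi} \varphi_{xy}(t, u) N_m^{(1)}(t) N_n^{(2)}(u)\,dt\,du.$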




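Defining $\mathfrak{P}_n^{(k)}(t)$, $\mathfrak{C}_n^{(k)}(t)$, $\mathfrak{S}_n^{(k)}(t)$ analogously to (2.06) and proceeding as in
the deduction of (2.07) we obtain


(2.12)  $N_n^{(k)}(t) = (2\pi P_n^{(k)} \sin \tfrac{1}{2}t)^{-1} \{\mathfrak{C}_n^{(k)}(t) \sin (n + \tfrac{1}{2})t - \mathfrak{S}_n^{(k)}(t) \cos (n + \tfrac{1}{2})t\}$,  k = 1, 2.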


3. Estimates of the kernels. We require estimates for $K_n(t, u)$ and $N_n^{(k)}(t)$.
We shall assume throughout this section that the sequences $\{p_n\}$ and $\{p_n^{(k)}\}$
(k = 1, 2) satisfy (1.02) and that $n|p_n| = O(|P_n|)$, $n|p_n^{(k)}| = O(|P_n^{(k)}|)$. All
$\{p_n\}$ and $\{p_n^{(k)}\}$ used in our theorems satisfy these conditions.

Since $|D_k(t)| \le k + 1/2$, it follows from (2.04) that(3)


(3.01)  $|K_n(t, u)| \le An^2$,  $n \ge 1$, all t, u.

Also from (2.04) we have

(3.02)  $|K_n(t, u)| \le A/tu$,  $0 < t, u \le \pi$.

In the same way from (2.10) we obtain


(3.03)  $|N_n^{(k)}(t)| < An$,  all t, k = 1, 2.


In order to obtain further estimates for the kernels we need to estimate 


$,(t) and (t). We put 

(3.04) | = = Va =>. | pe — Pearls 
ken 

and introduce the step functions 

(3.05) r(u) = R(u) = V(u) = 


where [u] as usual denotes the largest integer less than or equal to u. Let us 
note that by (1.02) 


Proceeding as on p. 768 of the paper of Hille and Tamarkin [4] and noting 
that ¢-'r(1/t) SA R(1/t) we have 


If we set 
d 
(3.08) = jen Devt, 
i=0 dt j=0 


(3) Here and in the sequel the letter A denotes an absolute constant. The constant need
not be the same at every occurrence.




then for 1/n StS 32/2 we have | sAn (k=0, 1, 2,-+-+,m). Using this 
fact and proceeding as in proving (3.07) we get 


(3.09) |B An{R(—) v(—)]}, 


Then for ¢, u>0, 1/nSt+us37/2, 21/n we have 


| Ka(t, u)| teal 


| Kn(t, u)| < u) 


Relation (3.10) follows from (2.07) and (3.07); (3.11) follows from (2.08), 
(3.07) and (3.09) if we note that K,(¢, u)=K,(u, ¢) and that t+ 2t in case 
t—u21/n. 

Analogously to (3.04) and (3.05) we can define 7, R®, V®, r®(u), 
R®™(u), V™(u) and obtain an estimate for | p(¢)| similar to (3.07). Then 
from (2.12) we have 


(3.10) 


(3.11) 


where 


()}, = 1,2, 


1 1 1 
Mart) = (-), Mar () =r 


t 
(k) 1 (k) crf 1 
Mas (4) = —V (—)} k = 1,2. 


Estimating $,(#) and $®(#) as on p. 767 of the paper of Hille and 
Tamarkin [4] we obtain from (2.07) and (2.12), respectively, 
(3.14) | Ka(t, S {A(®)/Ra} {Va + ra}, 
28, 
(k) (k) (k) (k) 
(3.15) |Nn +r}, =1,2, 


where A (5) depends only on 6. 
Finally we consider the (C, 1) kérnel K}(¢, u) which is a special case 
of K,(t, u) when p,=1. In case n21, OSuSzx/2 or OStS7/2, 


(3.13) 




«/2u Sn we shall show that 


A 2 
| Ki¢t, < 


+ An? 

(1 + m2] ¢— + + 
the positive square root being taken in all cases. Since $p_n = 1$, we have $r_n = 1$,
$P_n = R_n = n + 1$, $V_n = 0$ (n = 0, 1, 2, $\cdots$). Then, from (3.01), (3.02), (3.10) and
(3.11) we have, respectively,

(3.17) | Ka(t, An, n= 1, all ¢, u, 
(3.18). Ka(t, | A/tm, 
A A 
ntu(t + ntu | t- u| 

A 
t— u| (t+ u) 
Let D; be the part of the domain under consideration in which ¢S$2/n, uS2/n, 
D, that part in which ¢>2/n, uS1/n, D, the part in which OS¢—uSi1/n, 
t>2/n, u>1/n, Ds, the part in which t>2/n, 1/n<ust/2, Ds the part in 
which ¢>2/n, t/2<u<t—1/n, and D3, Ds, Dz, Dy the domains symmetric to 
D2, Ds, Do, Ds, respectively. Then (3.16) follows from (3.17) in D,, from (3.20) 
in Ds, from (3.18) in Dy, and from (3.19) in Dg and Dg. It follows in Ds, Ds 
D:, Dy by symmetry. Thus (3.16) is completely established. 


4. Preliminary lemmas. The following lemmas concerning the Nörlund
coefficients $p_n$ and $P_n$ will be useful.


LemMA 1. Jf =O(|Pn|), then n|pn| =O(|P.|) and 
=O(| Pal). 

2. If = O(| P|), thenn=O(|P,|) and P| /k 
=0(|P,|). 


It is clear that the hypothesis of Lemma 2 implies that of Lemma 1. These 
lemmas follow easily from the relations 


(3.16) 


(3.19) | Ka(t,«)|< 


1 3r 
—sitit+us—) 
n 2 


1 


k 
Pr = (k+1)pe+D — 


jmk+1 


We may also easily establish the following analogue of Abel's partial sum
formula.




Lemna 3. Let {ay}, {bj} be two sequences. Let 
A104 jk = — je = — jx = jn. 


Similarly define jx, Audi. Then 


j=c, kad j=c, kad 


kad 
n 
kad 


5. Local results making use of square partial sums. Our first theorem extends
the result of Grünwald [2] in two directions and also includes his result.


THEOREM 1. Let $N_p$ be a regular Nörlund method of summability satisfying
the condition


(5.01)  $\sum_{k=1}^{n-1} (n - k)|p_k - p_{k-1}| = O(|P_n|).$


Then at any point (x, y) such that 


k) = u) | du = o(hk), 


’ du = o( k) 


as $h, k \to 0$ simultaneously but independently, the sequence $\{s_{nn}(x, y; f)\}$ is summable
$N_p$ to f(x, y).


(5.02) 


It should be noted that the second condition on the function at (x, y) is
similar to the first. The first is applied to rectangles along the axes, the second
to rectangles along the bisectors of the angles between the axes. The factors
$2^{-1/2}$ are not essential, but are introduced for convenience.

Proof. A regular Nörlund method $N_p$ includes (C, 1) if(4)


(5.03)  $n|p_0| + \sum_{k=1}^{n-1} (n - k)|p_k - p_{k-1}| < A|P_n|.$


Hence if $N_p$ satisfies (5.01) and is regular, then it includes (C, 1). Thus it
suffices to prove the theorem for (C, 1). Let $t_n^1(x, y; f)$ denote the (C, 1) transform
of the sequence $\{s_{nn}(x, y; f)\}$. From (2.05) we have


(5.04)  $t_n^1(x, y; f) - f(x, y) = \int_0^{\pi} \int_0^{\pi} \varphi_{xy}(t, u) K_n^1(t, u)\,dt\,du$


where $K_n^1(t, u)$ is given by (2.04) with $p_n = 1$. Fix (x, y).

(4) See, for example, Hille and Tamarkin [4, p. 782].

Given e>0 we can choose 6 such that 0<5<7/4 and such that 
(5.05) | (4, k)|<e| | &*(h, <e| forO<| | $28. 


Suppose n >2/5. Let B;= [0, x; 0, r]—[0, 5; 0, 5]. Then 


Then by (3.16) 


3 3 ‘u) | didu 
JIis Ant J (1 + + 8/248/2) 


| bey(t, u) | dtdu 
i J, [1 + — [1 + + Jut Ju. 


Integrating Ji, twice by parts and applying (5.05) we get 


+ Ante f f ia 
0 (1 + 4 43/24 8/2)2 


= f +f — 
0 (1 + 0 1/n n n*® 1/n yrl2 


Hence, since n§>2, we easily obtain Ju Ae. Applying the transformation 
to Jz and proceeding as above we get 
JusAe. Thus 

Next let B, be that part of B; in which t2 6, ux 8’ < 6/4, Bz the domain 
symmetric to B,, B; that part of B; in which |4—1| s 5’, B, the rest of Bs. 
Clearly B,-B,;=B,-B;=0 since 5’ < 6/4. In Bi+B.+B; we have | Kit, u)| 
A/(35/4)*. This follows from (3.20) in +B, and from (3.18) in Bs. Since 
¢zy(t, u) is integrable we can choose 5’ depending only on 6 (and hence on e), 
0<8’< 5/4, such that 


Jus 


But 


(5.07) f u)Ka(t, u)| dtdu <«. 




Fixing 5’, we see that, on account of (3.14), Ki(t, u) +0 uniformly in By. Thus 
for all sufficiently large n we have J; < 2¢and consequently | th(x, f) —f(x, y)| 
Ae. That is, A(x, y; f) f(x, y) as n— ©. This completes the proof of the 
theorem. 


COROLLARY 1.1. Let $N_p$ be a regular Nörlund method of summability satisfying
(5.01). Then $N_p$ applied to the square partial sums of the double Fourier
series possesses the localization property.


For if f vanishes in a neighborhood of (x, y), $\varphi_{xy}(t, u)$ satisfies (5.02).

Before showing that (5.01) is also partly necessary in order that $N_p$ applied
to the square partial sums should possess the localization property we
prove the following lemma.


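LEMMA 4. Let $N_p$ be a regular Nörlund method of summability with $p_n \ge 0$,
$p_n$ non-increasing, $p_1 < p_0$, $n/P_n \to \infty$ as $n \to \infty$. Suppose $0 < \delta < \pi$. Let
$E = [-\pi, \pi; -\pi, \pi] - (-\delta, \delta; -\delta, \delta)$. Then there exists N > 0 such that


(5.08)  $\operatorname{ess\,sup}_{E} |K_n(t, u)| > An/P_n$,  all n > N, A > 0.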


Proof. From (2.04) we have 


(— 1)* (— 1)* 


(— 1)*kps 
k=O 
= Ji Jo. 
Since p, is non-increasing we have immediately |J:| 2(po—p1)/2m?P». 
If we set Wi =)>03_o(—1)%, then | W,| Sk and we easily get 


n—1 


n 
Dd 1)*kpe| = | Wilde — Pers) + Wapn| S Pe — Puss) + 
k=O k=O 


k=O 


=>» Pa. 
kewl 


Hence $|J_2|\le 1/2\pi^2$. But we can choose $N>0$ such that $n/P_n>2/(p_0-p_1)$ for all $n>N$. Then for $n>N$ we find $|K_n(\pi,0)|\ge |J_1|-|J_2|\ge A\,n/P_n$, $A>0$. But $E$ is closed and for each $n$, $K_n(t,u)$ is continuous. Thus (5.08) follows.


THEOREM 2. Let $N_p$ be a regular Nörlund method of summability with $p_n\ge 0$, $p_n$ non-increasing, $p_1<p_0$, $n/P_n\to\infty$ as $n\to\infty$. Then there exists $f$ vanishing in a neighborhood of $(0,0)$ such that $\limsup_{n\to\infty}|t_n(0,0;f)|=+\infty$.




Proof. Let $E=[-\pi,\pi;-\pi,\pi]-(-\delta,\delta;-\delta,\delta)$, $0<\delta<\pi$. Consider the class of functions $f\in L[-\pi,\pi;-\pi,\pi]$ which vanish in $(-\delta,\delta;-\delta,\delta)$, that is, the class of functions $f\in L(E)$. Then


(5.09) $t_n(0,0;f)=\iint_E f(t,u)K_n(t,u)\,dt\,du=T_n(f)$


defines a linear functional on the space $L(E)$ with norm


(5.10) $\|T_n\|=\operatorname*{ess\,sup}_{E}|K_n(t,u)|.$


Now suppose that the conclusion of our theorem does not hold. Then for every $f\in L(E)$, $\limsup_{n\to\infty}|T_n(f)|<\infty$. By a well known theorem of Banach and Steinhaus(*) it follows that $\|T_n\|\le M$ for all $n$. Thus by (5.10) and (5.08) we have $A\,n/P_n\le M$, $A>0$, for all $n>N$. This contradicts the hypothesis.


COROLLARY 2.1. Let $N_p$ be a regular Nörlund method of summability with $p_n\ge 0$, $p_n$ non-increasing, $p_1<p_0$. Then (5.01) is necessary as well as sufficient for $N_p$ applied to the square partial sums of the double Fourier series to possess the localization property.


Proof. To prove the necessity we first note that $n/P_n$ is non-decreasing since $p_n$ is non-increasing. Then in order that $N_p$ applied to the square partial sums should possess the localization property we must have $n/P_n$ bounded by Theorem 2. The condition (5.01) is an immediate consequence of this.
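
The monotonicity of $n/P_n$ invoked in this proof can be checked in one line. The sketch below assumes the usual Nörlund notation $P_n=p_0+p_1+\cdots+p_n$ with $P_n>0$; that definition is not restated in this section, so it is an assumption here.

```latex
% Assumption: P_n = p_0 + p_1 + ... + p_n > 0, with p_k >= 0 non-increasing.
\[
\frac{n}{P_n}\le\frac{n+1}{P_{n+1}}
\iff nP_{n+1}\le (n+1)P_n
\iff n\,p_{n+1}\le P_n ,
\]
% The last inequality holds because each of the n+1 terms of P_n is at least p_{n+1}:
\[
P_n=\sum_{k=0}^{n}p_k\ \ge\ (n+1)\,p_{n+1}\ \ge\ n\,p_{n+1}.
\]
```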

The case in which $p_n\ge 0$, $p_n$ non-increasing, is especially important as it includes Cesàro $(C,\alpha)$, $0<\alpha\le 1$. Because of the simplicity of the result in this case we state it separately.


COROLLARY 2.2. Under the hypotheses of Corollary 2.1, a necessary and sufficient condition that $N_p$ applied to the square partial sums of the double Fourier series should possess the localization property is $n=O(P_n)$.


From this it follows that $(C,\alpha)$ applied to the square partial sums possesses the localization property if and only if $\alpha\ge 1$. Thus Grünwald's [2] result is the best possible in the sense that it cannot be extended to $(C,\alpha)$, $\alpha<1$.
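
The criterion $n=O(P_n)$ can be illustrated numerically. The sketch below assumes the standard identification of $(C,\alpha)$ with the Nörlund weights $p_n=\binom{n+\alpha-1}{n}$ and $P_n=p_0+\cdots+p_n$; neither formula is stated in this section, and the function name `cesaro_norlund_ratio` is purely illustrative.

```python
def cesaro_norlund_ratio(alpha, n_max):
    """Return [n / P_n for n = 1..n_max] for the assumed (C, alpha) weights.

    Assumption: p_0 = 1 and p_n = p_{n-1} * (n + alpha - 1) / n,
    i.e. p_n = binom(n + alpha - 1, n), with P_n = p_0 + ... + p_n.
    """
    p = 1.0  # p_0
    P = 1.0  # P_0
    ratios = []
    for n in range(1, n_max + 1):
        p *= (n + alpha - 1) / n  # recurrence for the binomial weights
        P += p
        ratios.append(n / P)
    return ratios


r_half = cesaro_norlund_ratio(0.5, 10_000)  # alpha < 1: n / P_n keeps growing
r_one = cesaro_norlund_ratio(1.0, 10_000)   # alpha = 1: n / P_n = n / (n + 1)
print(r_half[99], r_half[9999])
print(r_one[99], r_one[9999])
```

For $\alpha=1$ the ratio stays below 1, while for $\alpha=\tfrac12$ it grows roughly like $\sqrt{n}$, in line with the dichotomy at $\alpha=1$.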

6. Local properties of restricted summability. We turn now to restricted double Nörlund summability of the rectangular partial sums of the double Fourier series. The results are similar to those in §5.


THEOREM 3. Let $N_p$ be a double Nörlund method of summability satisfying the conditions


(6.01) — peal =O(| Pr’ |), k = 1,2. 
j=l 


(*) See, for example, Banach [1, p. 80]. 


Then at any point (x, y) such that 
h k 
(6.02) k) = f dt f | bey(t, | du = o(hk) 
0 0 


as $h, k\to 0$ simultaneously but independently, $\sigma(f)$ is restrictedly summable $N_p$ to $f(x, y)$.


Proof. Let $\lambda\ge 1$ be any fixed number. It suffices to show that $t_{mn}(x,y;f)\to f(x,y)$ as $m,n\to\infty$ in such a manner that $m/n\le\lambda$, $n/m\le\lambda$. Fix $(x,y)$. Given $\varepsilon>0$, we can choose $\delta>0$ such that $1/\delta$ is an integer greater than 2 and such that


(6.03) k) < ehk, 
whenever $0<h$, $k\le\delta$. Then from (2.11) we obtain


| tma(x, f) — f(x, »)| 


(6.04) + f f | dey(t, (us) | dudt 
0 0 


On account of Lemma 1, the estimates of §3 may be applied here. From (3.15), (6.01) and Lemma 1 we have that $N_m^{(1)}(t)N_n^{(2)}(u)\to 0$ uniformly in $[\delta,\pi;\delta,\pi]$. Thus $J_1$ is small for all sufficiently large $m$ and $n$. If $m/n\le\lambda$, $n/m\le\lambda$ we see from (3.03), (3.15), (6.01) and Lemma 1 that $N_m^{(1)}(t)N_n^{(2)}(u)$ is bounded in the domains of integration of $J_2$ and $J_3$. Thus we can find $\delta'$ such that $0<\delta'<\delta$ and so small that


(yw? (u) | dudt 


is uniformly small. In the remainder of the domains of integration of $J_2$ and $J_3$, $N_m^{(1)}(t)N_n^{(2)}(u)\to 0$ uniformly and thus $J_2+J_3$ is small for all sufficiently large $m$ and $n$ such that $m/n\le\lambda$, $n/m\le\lambda$. Thus we can find $N_0>1/\delta$ such that
if $m,n>N_0$, $m/n\le\lambda$, $n/m\le\lambda$. In the following we suppose $m,n>N_0$. Then


(6.05) + f f u)Ns (u) | dudt 
1/m I/n 




Then from (3.03) and (6.03) we have at once Ju SAe. Also by (3.03) and 
(3.12) we have 


(1) (1) 


(6. 06) J 42 s An “aw f | zy(t, u) | + M + 


= + + 
Then from (3.13) 


1 
Jin = An>, auf | (> 
1/(4+1) t 


jut/8 


An 


A m—1 
i 


= 1/6 
RY ’ 


But by Lemma 2 and (3.06) we get 


(k) 


Thus Ji, S$ Ae. Substituting from (3.13) in J2, integrating twice by parts and 
using Lemma 1 we have J3,< Ae. Again from (3.13) we have 


1 
Jin = An uf | u) | dt 
jm 1/(4+1) PR® t 


An a) 1 


m—1 1 1 1 


/8 Jj 


1 


m—1 24 1 


R 
R 


But by (3.06) and (6.01) 


(k) 2 (k) (k) 
(Ve -V; Dip — 
= (k) 
(1/8)+1 
Likewise 


Thus J3,sAe. in we have JeSAe. In the same way 
Ji Ae. Turning now to Ju we have by (3.12) 


Jus J. dt f | m) | + + Mona(t)} 
1/n 


(6.10) Mar (u) + Mas (u) + MQ 


j=l 


We now show that SAe (j=1, 2,3, +++ ,9) as was done with the Jj. For 
example, let us take J%,. Then from (3.13) 


(u) } du 


m—1,n—1 1/4 ql) 1 
=A > f a | bey(t t,u u) du 


(1) 
A m—1,n—1 1 1 


RYR® i /8 k 


Applying Lemma 3 and dropping clearly negative terms we obtain — 
Are m—1,n—1 1 1 
us RYOR® j 
i + 1)[(2k + 
(2) 


—Vi)—k — peal] 
1 
jm 1/8 J 


+ a) a) 


tas +1)(Ve—Ve) =k peal] 


1 1 




Again dropping those terms which are negative and applying (6.03), (6.01), 
(6.08), (6.09) and Lemma 1 we obtain Ji,< Ae. Altogether, then, JusAe. 
Combining our estimates in (6.05) and (6.04) we have $|t_{mn}(x,y;f)-f(x,y)|\le A\varepsilon$ if $m,n>N_0$, $m/n\le\lambda$, $n/m\le\lambda$. This completes the proof of the theorem.


COROLLARY 3.1. Let $N_p$ be a double Nörlund method of summability satisfying (6.01). Then restricted $N_p$ summability possesses the localization property.


Before showing (6.01) is also partly necessary in order that restricted $N_p$ possess the localization property we prove the following lemma.


LEMMA 5. Let $N_p$ be a double Nörlund method of summability with $p_n^{(k)}\ge 0$, $p_n^{(k)}$ non-increasing ($k=1,2$). Then


Proof. From (2.10) we have 


1 nn F 
Na (x) = 4(-1)'= 


(0) | 2 k= 1,2. 


The first inequality of cil follows eee Also from (2.10) 


Thus 


(6.12) 


But since P® is Ra sar we have (P®)-1)>%-) P™ <n and thus 
(k 1 


(k) 


Substituting this in (6.12) we obtain the second inequality of (6.11). 
THEOREM 4. Let $N_p$ be a double Nörlund method of summability with $p_n^{(k)}\ge 0$, $p_n^{(k)}$ non-increasing ($k=1,2$). Suppose $p_1^{(k)}<p_0^{(k)}$, $n/P_n^{(k)}\to+\infty$ as $n\to\infty$ for $k=1$ or 2 or both. Then there exists $f$ vanishing in a neighborhood of $(0,0)$ such that $\limsup |t_{mn}(0,0;f)|=+\infty$.


The proof is analogous to that of Theorem 2, using Lemma 5 instead of 
Lemma 4. 
As in §5 we may prove the following corollaries: 


COROLLARY 4.1. Let $N_p$ be a double Nörlund method of summability with $p_n^{(k)}\ge 0$, $p_n^{(k)}$ non-increasing, $p_1^{(k)}<p_0^{(k)}$ ($k=1,2$). Then (6.01) is necessary as well as sufficient in order that restricted $N_p$ possess the localization property.


COROLLARY 4.2. Under the hypotheses of Corollary 4.1, necessary and sufficient conditions that restricted $N_p$ summability should possess the localization property are $n=O(P_n^{(k)})$ ($k=1,2$).


It follows that restricted $(C,\alpha,\beta)$ possesses the localization property if and only if $\alpha\ge 1$, $\beta\ge 1$.

7. Preliminary lemmas for almost everywhere results. We turn now to the study of methods of summability which sum the double Fourier series almost everywhere. The results are generalizations and extensions of those due to Marcinkiewicz and Zygmund [5, 6] and Grünwald [3]. The proofs are based on those given by Marcinkiewicz and Zygmund. We shall require the following lemmas.


LEMMA 6. Let $a$ be any fixed positive number. For $(x,y)$ belonging to the square $Q\,[-\pi,\pi;-\pi,\pi]$ we write


1 ah h 
d > 
wf y + u)| dt 


a 1 t— t+u 
(7.02) fa (x,y) = 21/2 ) ds 


(7.01) 9) = sup 


where the number $h$ is so small that the rectangles over which the integrals are taken are contained in $Q'\,[-2\pi,2\pi;-2\pi,2\pi]$. Let


= E [f(x,y >t], ®= 9) >] 


(2,y) (z,y) 


for any &>0. Then 


(7.03) f y) | dxdy. 


In the case of $f_a^{*}(x,y)$ the proof was given by Marcinkiewicz and Zygmund [6]. For the case of $f_a^{**}(x,y)$ the proof can be carried through in the same way.




LEMMA 7. Let $\alpha$ be any fixed positive number. For $(x,y)$ belonging to the square $Q\,[-\pi,\pi;-\pi,\pi]$, we write


(7.04) 9) = sup (fe(x, fors = 0,41, 42,---, 


(7.05) fx, y) = sup (fe (x, 2"), fors=0,+1, +2,- 
We write 


E*-(t) = E [f*e(x, y) Er*e(t) = E [f**e(x, y) > €] 


(2,4) (2,9) 


for any &>0. Then 


| | < A 
7.06 dxdy, 
Gj ) | E**2(¢) | | f(x, 9) | 


where $A(\alpha)$ depends only on $\alpha$.


The proof is similar to that given by Marcinkiewicz and Zygmund for their Lemma 3 [6].


LEMMA 8. Suppose $P_n\ge 0$, $P_n$ non-decreasing, $\alpha\ge 0$. Then the condition


(7.07) (=) - O(P,) 


is equivalent to the condition


n—1 
(7.08) Dd = O( Ps). 
k=O 


Proof. Suppose (7.08) is satisfied. Let $j$ be an integer such that $2^j\le n<2^{j+1}$.
Then 
n i-1 Pe ( nN y 
b (=) = (=) + 
( n y} 
k 


Pen + Pe? 21.2 


j-1 
2% 4 Py + O(P,) 
k= 


showing (7.07) to be satisfied. The proof of the converse is similar. 


= O(P#) + O(P,) = O(Ps) 




8. Lemma for restricted Nörlund summability. Before proving our first result on almost everywhere summability we need a lemma.


LEMMA 9. Let $N_p$ be a double Nörlund method of summability. Suppose there exists a constant $\alpha>0$ such that


n n a 
(8.01) Dil — (5) = 0(| |), k = 1, 2, 
j=l 
and 
(k) 


(8.02) > O(| Ps” |), k= 1,2. 
j=l J 


Let $\lambda\ge 1$ be any fixed number. Let


(2) 


(8.03) n(x = sup ff y+ | didn 


where Then for any 


A xe 
(8.04) | E {[(x, | s f f | f(x, y) | 
(2,¥) 


where A(a) depends only on a. 


Proof. If (8.01) and (8.02) hold for any $\alpha=\alpha_0>0$, then they also hold for all $\alpha$ such that $0\le\alpha\le\alpha_0$. Hence we may suppose that $0<\alpha<1$.

Let $j$, $k$ be integers such that $2^j\le m<2^{j+1}$, $2^k\le n<2^{k+1}$. Also let $m/n\le\lambda$, $n/m\le\lambda$. Then $2^{|j-k|}\le 2\lambda$. Let


(2) 


tmn( 2, = +t, y+u)Nw (u) | dtdu 


+f is y+u)Nm (Nn (u) | dudt 


= Pi(x, y) + y) + Rie(x, y) + 9). 


On account of Lemma 1, the estimates of §3 may be applied. Then from 
(3.03) and (3.12) we have 


Am | f(x +t, y+ u)| 
0 ® 


R® 


By (7.01) and (7.04) the rth term of the sum on the right does not exceed 


Pix(x, y) 




Am r (2) ( 2 (2) (2) ( 
@) 2 { 2 fr 


af +t, y+ u)| du 


Am2-i 


In order to sum these terms we shall need the following: 


= 


(8.07) < ARP, 


r=0 


(8.08) Qrtetk—-r) < A2* 


(8.09) ve y? _ < AR”. 


To prove (8.07) we apply (3.06) to (8.02) and make use of Lemma 8. (8.08) 
is immediate. For (8.09) we first note that 


2 
T 


s=1 


2 
y” y? (=) < y” > Wag Fock 


—24-1 


y= 2,3,4,---,k—1. 


Substituting in the left side of (8.09), reversing the order of the summations and denoting the greatest integer less than or equal to $2+\log_2(s-1)$ by $g(s)$, we have


k-1 2 
_y®” (=)| 
n 


(ols)+1) — 4 (2) 


Then (8.09) follows from (8.01). 


(8.06) 


Summing (8.06) from $r=0$ to $k-1$, considering separately the cases $j\ge k$ and $j<k$, and using (8.07)–(8.09) we get


(8.10) $P_{jk}(x,y)\le A\lambda^{\alpha}f^{*\alpha}(x,y).$

In the same way we obtain

(8.11) $Q_{jk}(x,y)\le A\lambda^{\alpha}f^{*\alpha}(x,y).$


Next from (3.12) we have 


i-1,k—1 
Ru(*,y) SA Zn 


where 


Zn= f as f | fatty +m) | {Man + Maa) + Mons 


q-r-1 


()} 
{M@ (u) + MQ (u) + MY (u)} du. 


Each term of this sum consists of 9 parts each of which may be summed by 
making use of (8.07)-(8.09) and the analogue of (8.09). For example, let us 
consider that part arising from M)(t)M®(u). The general term in this sum 
does not exceed - 


A (1) 02) (2) 2°\ ] 
[vi (=)|/ (a9). 


Considering the case j 2k we have 


&+a\r—s 2° 
T 
s=0 rms 


r=0 
2 ils _y” (=)| Rr? 
< 
The case k>j can be treated similarly. Thus it follows that 
(8.12) Ryr(x, y) y). 


Finally by (3.03) 


j 




—r2+ 
S f¥o(y, y) S Ad*f**(x, y). 


Combining (8.10)—(8.13) we see that (x, y) SAX*f**(x, y). But the integral 
on the right in (8.03) is the sum of four integrals, all analogous to /,(x, y). 
Thus Ay(x, y; f) SAAF**(x, y). (8.04) now follows directly from Lemma 7. 

9. Restricted Nörlund summability almost everywhere. We are now ready to prove our first theorem on almost everywhere summability.


THEOREM 5. Let $N_p$ be a double Nörlund method of summability. Suppose there exists a constant $\alpha>0$ such that (8.01) and (8.02) are satisfied. Then $\sigma(f)$ is restrictedly summable $N_p$ almost everywhere to $f$.


(8.13) ns Amn f 


Proof. This theorem follows immediately from Lemma 9. It suffices to make a decomposition $f=f_1+f_2$ where $f_1$ is a trigonometrical polynomial and $f_2$ is such that


RA | fo(x, y)| > 8} | <a, 


and 


E {lim sup | tmn( x, y; > 5} <6 


($m/n\le\lambda$, $n/m\le\lambda$, $\lambda\ge 1$ any fixed number), where $\delta$ is a fixed positive number as small as we please. Since $t_{mn}(x,y;f_1)\to f_1(x,y)$ it follows that $\limsup |t_{mn}(x,y;f)-f(x,y)|$, where $m,n\to\infty$ in such a manner that $m/n\le\lambda$, $n/m\le\lambda$, does not exceed $2\delta$ except on a set of measure less than $2\delta$. This completes the proof of the theorem.

The result of Marcinkiewicz and Zygmund [6], namely that $\sigma(f)$ is restrictedly summable $(C,\alpha,\beta)$, $\alpha,\beta>0$, almost everywhere to $f$, follows immediately from Theorem 5.

10. Lemma for square partial sums. Turning now to the almost everywhere summability of the square partial sums we require the following lemma.


LEMMA 10. Let $N_p$ be a regular Nörlund method of summability. Suppose there exists a constant $\alpha>0$ such that


(10.01) = 0(| Pal) 


j=l 


and 


(10.02) (*)- O(| Pa|). 




Let 


(10.03) h*(x, y; f) = | f(x +t, y + u)Kn(t, | dtdu. 


Then for any 


A 
(10.04) E {Ue > | f f | f(x, 9) | dxdy 
(2,y) 


where $A(\alpha)$ depends only on $\alpha$.


Proof. As in Lemma 9 we may suppose $0<\alpha<1$. Let $k$ be an integer such that $2^k\le n<2^{k+1}$. Let $D$ be the part of $Q\,(-\pi,\pi;-\pi,\pi)$ in which $t,u\ge 0$. Let $D^{(0)}$ be the part of $Q$ in which $t,u>\pi/2$. Divide $D-D^{(0)}$ into 9 domains $D^{(i)}$ ($i=1,2,3,\cdots,9$) as in the proof of (3.16), the only difference being that in all the inequalities defining the regions $1/n$ is replaced by $\pi 2^{-k-1}$. We shall evaluate separately the integrals


Ay -ff | f(x +t, + u)Ka(t, u) | dtdu, i= 1, 2, » 9, 
(10.05) 
(0) 
A =x Sf. | f(x +t, y+ u)K,(t, | dtdu. 


This may be done by methods similar to those used in the evaluation of $P_{jk}(x,y)$ and so on in Lemma 9. First of all from (3.01), (7.01) and (7.04) we get


(10.06) Ay (x, 9). 


In D® we note that uSt/2, 1/nSt+uS2t. Apply- 
ing (3.11) and using the relations (8.07)—(8.09) we easily get 


(10.07) Ay (x, 9). 


In D®, t2uzt/2, t+us2ts4u. Applying (3.02) and making the transforma- 
tion t= 2-1/2(¢/ — 4’), w= 2-"/2(t’ +n’) we have 


ke t— t+u 
A; auf y+ a) t-°dt 
t— t+.u ads 


By (7.02) and (7.05) and taking account of (8.08) we have 


(10.08) A, Af*(x, 9). 




In D®, uSt/2, 1/nSt+us2t. In D®, t—uz1/n, 
t2u2t/2, 1/nSt+us2ts4u. Then by (3.10) we have 


As 


Now it is clear that 
k 2-8 
ff (+++ )dtidu s >> au f ( 


ms 


Using these facts and (8.07)—(8.09) we find that Ji+J2S Af**(x, y). Trans- 
forming J; by the substitution ¢=2-1/2(t’—1’), u=2-V2(t' +4"), noting that 


du’ dt’, 


and using (8.07)—(8.09) we find that J; S Af***(x, y). Thus 


where 


(8) 


(10.09) Ay +Ax SA{f(x, 9) »)}. 
Considering now the symmetric domains we have also 


(10.10) Ap +Ap +4, +4y 9) + y)}. 


In D®, t, u>m/2. Applying (3.02), (7.01) and (7.04) we get 
(10.11) Af**(x, y). 


Combining (10.06)—(10.11) and noting that Q is the sum of four domains like 
D we have 


(10.12) h*(x, y; f) S A{f*e(x, y) + f**#(x, y)}. 




(10.04) now follows immediately from Lemma 7.

11. Summability of the square partial sums almost everywhere.


THEOREM 6. Let $N_p$ be a regular Nörlund method of summability. Suppose there exists a constant $\alpha>0$ such that (10.01) and (10.02) are satisfied. Then the sequence $\{s_{nn}(x,y;f)\}$ is summable $N_p$ almost everywhere to $f(x,y)$.


Proof. This theorem follows immediately from Lemma 10 just as Theorem 5 follows from Lemma 9.

We easily obtain the following corollary of Theorem 6.


COROLLARY 6.1. The sequence $\{s_{nn}(x,y;f)\}$ is summable $(C,\alpha)$, $\alpha>0$, almost everywhere to $f(x,y)$.


In conclusion let us note that conditions (5.02) and (6.02) do not in general hold almost everywhere. Hence it was not possible to deduce any results concerning almost everywhere summability from Theorems 1 and 3. However $(C,\alpha)$ applied to the square partial sums of the double Fourier series is effective almost everywhere if $\alpha>0$ but possesses the localization property if and only if $\alpha\ge 1$. Also restricted $(C,\alpha,\beta)$ summability of the double Fourier series is effective almost everywhere if $\alpha,\beta>0$, but possesses the localization property if and only if $\alpha\ge 1$, $\beta\ge 1$.


BIBLIOGRAPHY 
1. S. Banach, Théorie des Opérations Linéaires, Warsaw, 1932.

2. G. Grünwald, Zur Summabilitätstheorie der Fourierschen Doppelreihe, Proceedings of the Cambridge Philosophical Society, vol. 35 (1939), pp. 343-350.

3. G. Grünwald, Über die Summabilität der Fourierschen Reihe, Acta Litterarum ac Scientiarum Szeged, Sectio Scientiarum Mathematicarum, vol. 10 (1941), pp. 55-63.

4. E. Hille and J. D. Tamarkin, On the summability of Fourier series. I, these Transactions, vol. 34 (1932), pp. 757-783.

5. J. Marcinkiewicz, Sur une méthode remarquable de sommation des séries doubles de Fourier, Annali della R. Scuola Normale Superiore di Pisa, (2), vol. 8 (1939), pp. 149-160.

6. J. Marcinkiewicz and A. Zygmund, On the summability of double Fourier series, Fundamenta Mathematicae, vol. 32 (1939), pp. 122-132.


BROWN UNIVERSITY, 
PROVIDENCE, R. I. 


TRANSITIVITIES OF BETWEENNESS 


BY 
EVERETT PITCHER AND M. F. SMILEY 


Introduction. The examination of the foundations of geometry which interested many prominent mathematicians about the turn of the century brought to light the importance of the fundamental notion of betweenness (see, for example(1), [10, 11]). This notion has suffered the treatment which modern mathematics metes out to all its concepts, namely, first an examination of the concept in a particular instance followed by wider and wider generalizations. The first part of this program for the concept of betweenness was carried through by Pasch, Huntington and Kline [8, 10]. The simplicity of the concept permitted them to give an elegant and complete theory for the case of linear order. In the direction of generalizations(2), K. Menger and his students have been one of the most important influences in the study of betweenness in metric spaces [9, 3].

We purpose here to add to both phases of this program. The first part of 
our paper continues the analysis of Huntington and Kline into an examina- 
tion of postulates involving five points; the second part deals mainly with a 
definition of betweenness in lattices which generalizes metric betweenness in 
metric lattices (see [5, 6]). It is hoped that the five point transitivities may 
prove interesting and their analysis valuable. If we restrict our attention to 
the relation of betweenness in linear order such cannot be the case since four 
point properties are then sufficient to describe completely the betweenness 
relation. We feel that the results of the second part exhibit the properties of the betweenness relation as reflections of properties of the underlying space(3).

We shall use the notations of set theory which have become standard. In 
the second part we shall assume a knowledge of the fundamentals of both 
lattice theory and metric geometry. We refer the reader to the recent books 
Distance Geometry by L. M. Blumenthal [3] and Lattice Theory by Garrett 
Birkhoff [1]. We shall use the terminology and notation of these books in the second part.


Presented to the Society, May 2, 1941; received by the editors June 11, 1941. 

(1) The numbers enclosed in brackets refer to the list of references at the end of the paper.

(2) The chordal systems recently introduced by W. Kaplan (Duke Mathematical Journal, vol. 7 (1940), pp. 165-167) are a generalization of linear order involving two triadic relations.

(3) The oft-quoted remark of K. Menger that Postulate B of Huntington and Kline should not be regarded as a property of betweenness but as a property of the underlying space [9, p. 79; 3, p. 36] indicates that it is easy to lose sight of this fact.




TABLE OF CONTENTS


Part I. TRANSITIVITIES OF BETWEENNESS

1. Fundamental assumptions
3. Weak transitivities on four points
5. Selection of fundamental five point transitivities
6. The logical relations among the fundamental strong transitivities
7. The examples in the existential theory

Part II. APPLICATIONS

9. Interpretations of certain of the five point transitivities for lattice betweenness
11. Betweenness in metric, semi metric, and metric ptolemaic spaces

Part I 


We shall extend the discussion of an abstract relation of “betweenness” 
initiated by Pasch [10] and developed by Huntington and Kline [8] by re- 
laxing some of the fundamental postulates of Huntington and Kline and by 
considering other possible postulates, particularly transitivities on five points. 

1. Fundamental assumptions. We consider a set K of points a, b, c, d, x, ···, and a triadic relation called betweenness, which holds (is positive) or fails (is negative) for each ordered triple of points, not necessarily distinct, in K. If the relation holds for the triple a, b, c, we write abc, read as written or as “b is between a and c.” We make the following assumptions throughout Part I.

α. abc if and only if cba (symmetry in the end points).

β. abc and acb if and only if b = c (closure).

Postulate α is Postulate A of Huntington and Kline [8]. Postulate β is similar to their Postulate C. Postulates α and β together imply the statements (1) and (2) below.

(1) aba if and only if a=b. 

(2) Every two positive relations on three points (not necessarily distinct) are 
equivalent or inconsistent. 
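
As a quick sanity check, Postulates α and β and statement (1) can be verified by brute force in the most familiar model. The sketch below assumes closed-interval betweenness on a small set of integers as a stand-in for linear order; the helper name `between` is illustrative and not part of the paper.

```python
from itertools import product


def between(a, b, c):
    """Assumed model of 'abc': b lies in the closed segment with ends a and c."""
    return min(a, c) <= b <= max(a, c)


K = range(4)  # a small linearly ordered set of points

# Postulate alpha: abc if and only if cba.
alpha_ok = all(between(a, b, c) == between(c, b, a)
               for a, b, c in product(K, repeat=3))

# Postulate beta: abc and acb if and only if b = c.
beta_ok = all((between(a, b, c) and between(a, c, b)) == (b == c)
              for a, b, c in product(K, repeat=3))

# Statement (1): aba if and only if a = b.
stmt1_ok = all(between(a, b, a) == (a == b) for a, b in product(K, repeat=2))

print(alpha_ok, beta_ok, stmt1_ok)
```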

We do not assume that of an unordered triple of points one is between the other two (Postulate B of Huntington and Kline). In this respect our development will differ materially from theirs. This difference is essential because our interest lies in applications to lattices and metric spaces where B fails for very simple examples. We have replaced their Postulate D which requires that the three points of a linear triple be pairwise distinct by β because we wish our five point transitivities to specialize under identification of two points to four point transitivities. This change, though logically necessary, is essentially only a change in terminology.
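
One of the very simple examples in which B fails can be sketched as follows. It assumes the usual metric betweenness, abc when d(a, b) + d(b, c) = d(a, c), on three points at mutual distance 1; the point names and the helper `metric_between` are hypothetical.

```python
from itertools import permutations, product


def metric_between(d, a, b, c):
    """Metric betweenness (assumed definition): d(a,b) + d(b,c) == d(a,c)."""
    return d[a][b] + d[b][c] == d[a][c]


points = "pqr"
# Discrete metric: distinct points are at distance 1.
d = {u: {v: (0 if u == v else 1) for v in points} for u in points}

# Postulate B would require some point of the triple to lie between the other two.
B_holds = any(metric_between(d, a, b, c) for a, b, c in permutations(points, 3))

# Postulates alpha and beta still hold in this space.
alpha_ok = all(metric_between(d, a, b, c) == metric_between(d, c, b, a)
               for a, b, c in product(points, repeat=3))
beta_ok = all((metric_between(d, a, b, c) and metric_between(d, a, c, b)) == (b == c)
              for a, b, c in product(points, repeat=3))

print(B_holds, alpha_ok, beta_ok)
```

Here no point of the triple is between the other two, although the symmetry and closure postulates remain valid.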




2. Four point transitivities. The statements about four points in which two positive relations of betweenness imply a third, which are theorems about linear order, and from which no hypothesis can be deleted leaving an equivalent statement, will be termed strong transitivities on four points. They are as follows.

t1. abc·adb → dbc.

t2. abc·adb → adc.

t3. abc·bcd·b ≠ c → abd.

These are postulates (3), (2), and (1), respectively, of Huntington and Kline and are completely discussed by them. We shall need the fact that the only implication among the three is: “t1·t3 → t2” [8, p. 321].
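
That t1, t2, and t3 are theorems about linear order, and that the hypothesis b ≠ c in t3 cannot be dropped, can be confirmed by exhaustive search. This is a minimal sketch assuming closed-interval betweenness on four integers as the model of linear order; `between` and `holds` are illustrative helpers.

```python
from itertools import product


def between(a, b, c):
    """Assumed model of 'abc' in linear order: b lies in the closed segment [a, c]."""
    return min(a, c) <= b <= max(a, c)


def holds(hypotheses, conclusion, points):
    """True if every (a, b, c, d) satisfying all hypotheses satisfies the conclusion."""
    return all(conclusion(*p)
               for p in product(points, repeat=4)
               if all(h(*p) for h in hypotheses))


K = range(4)
abc = lambda a, b, c, d: between(a, b, c)
adb = lambda a, b, c, d: between(a, d, b)
bcd = lambda a, b, c, d: between(b, c, d)
b_ne_c = lambda a, b, c, d: b != c

t1 = holds([abc, adb], lambda a, b, c, d: between(d, b, c), K)
t2 = holds([abc, adb], lambda a, b, c, d: between(a, d, c), K)
t3 = holds([abc, bcd, b_ne_c], lambda a, b, c, d: between(a, b, d), K)
# Dropping b != c breaks t3, e.g. a = 1, b = c = 2, d = 0.
t3_without_distinctness = holds([abc, bcd], lambda a, b, c, d: between(a, b, d), K)

print(t1, t2, t3, t3_without_distinctness)
```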

3. Weak transitivities on four points. The statements concerning four points in which three distinct positive relations of betweenness imply a fourth, which are theorems about linear order, and from which no hypothesis may be deleted leaving an equivalent statement, will be termed weak transitivities on four points. They are as follows.

τ1. abc·adb·adc → dbc.

τ2. abc·adb·dbc → adc.


THEOREM 3.1. The statements τ1 and τ2 are the only weak transitivities on four points. The implications t1 → τ1, t2 → τ2 hold.

Proof. The second assertion is trivial. In order to prove the first assertion, we first observe that no two relations in hypothesis or conclusion of such a statement involve the same three letters by virtue of condition (2) of §1. Next, there are four ways of selecting unordered triples from four letters. Let abc be the first hypothesis and a, b, d be the letters in the second. Since a, b, d or b, c, d must occur in one hypothesis, it is always possible to achieve this situation by renaming the points. The second hypothesis is then one of the three, (1) dab, (2) adb, or (3) abd. The third hypothesis is on the points (i) b, c, d or (ii) a, c, d. We examine the possible relations on the three letters in one of the sets (i), (ii) with (1), (2), or (3) for consistency with linear order; we then examine the other one of the sets (i), (ii) for a conclusion of a theorem about linear order. In the eight cases we have:


Third hypothesis Conclusion 
(1) (i) bed acd 
(1) (ii) acd bcd 
(2) (i) dbc adc 
(2) (ii) adc dbc 
(3) (i) bed acd 
(3) (i) bdc adc 
(3) (ii) acd bed 


(3) (ii) adc bdc 




It is readily seen that these eight theorems reduce to two on suitable permutations of the letters of the hypotheses and conclusions. These two are τ1 and τ2. This completes the proof.

From this discussion of weak transitivities on four points it is apparent that an attempt at a weaker statement about four points with four or more hypotheses and one conclusion must contain two hypotheses or a hypothesis and a conclusion identical under α or a hypothesis or conclusion true under β.

4. Five point transitivities. The statements concerning five points in which 
three positive relations of betweenness imply a fourth, which are theorems 
about linear order, and from which no hypothesis can be deleted leaving an 
equivalent statement, will be termed strong transitivities on five points. They 
are as follows. 

T1. abc·adb·xdb·b ≠ d → xdc.
T2. abc·adb·bcx·b ≠ c → dcx.
T3. abc·adb·xcd·c ≠ d → acx.
. abc-dab- — abx. 
. abc- . — axc 

. abc-adb- — dex. 
. abc -abd - — abx. 
T8. abc·dab·xcd·a ≠ b → acx.
T9. abc·dab·xcd·a ≠ b → bcx.

T10. abc·abd·xbc·a ≠ b·b ≠ c → xbd.


T11. abc·adc·xab·a ≠ b → xad.
T12. abc·adc·xab·a ≠ b → xdc.
T13. abc·adb·xab·a ≠ b → xdc.
T14. abc·adb·xac → xad.
T15. abc·adb·xac → xdb.
T16. abc·adb·xac → xdc.
T17. abc·adb·acx → adx.
T18. abc·adb·acx → dbx.
T19. abc·adb·xad·a ≠ d → xac.
T20. abc·adb·xad·a ≠ d → xbc.
T21. abc·adb·xad·a ≠ d → xdc.
T22. abc·adb·bxc → adx.
T23. abc·adb·bxc → dbx.
T24. abc·adb·bcx·b ≠ c → adx.
T25. abc·adb·bcx·b ≠ c → dbx.
T26. abc·adb·bdx·b ≠ d → xbc.
T27. abc·adb·xcd·c ≠ d → abx.
T28. abc·adb·xcd·c ≠ d → adx.
T29. abc·adb·xcd → dbx.
T30. abc·adb·xcd → bcx.
T31. abc·adb·cxd → axc.
T32. abc·adb·cxd → adx.
T33. abc·adb·cdx → xbc.
T34. abc·adb·cdx → xdb.
T35. abc·dab·adx·a ≠ b·a ≠ d → xac.
T36. abc·dab·adx·a ≠ b·a ≠ d → xbc.
T37. abc·dab·xcd·a ≠ b → dax.
T38. abc·dab·xcd·a ≠ b → dbx.


THEOREM 4.1. The statements T1–T38 are a complete list of strong transitivities on five points.


Proof. In order to effect this enumeration we reason as follows. One letter in the three hypotheses must occur only once since there are nine places to be filled with five letters. We shall denote this letter (or one such if there are more) by x and agree that it occurs in the third hypothesis. Then the first two hypotheses are on four letters with two letters in common. Letting abc be the first hypothesis and d the remaining letter, we see that the second hypothesis must be on the letters (i) a, c, d or (ii) a, b, d; the case b, c, d reduces to a, b, d on interchange of a and c, which by virtue of α does not change abc. Calling a and c in abc terminal and b medial, we see that the letters common to the first and second hypotheses must fall under one of the following cases:

I. each letter terminal in both hypotheses; 

II. one letter terminal in both; one letter medial once (say in the first) 
and terminal once; 

III. one letter terminal in both; one medial in both; 

IV. each letter terminal once and medial once. 

Possible pairs of hypotheses to fill the first two places are then 


A. abc and adc, C. abc and abd, 
B. abc and adb, D. abc and dab. 


On examination one will find that we have used the pairs I (i), II (ii), III (ii), 
and IV (ii). The pairs I (ii), II (i), III (i), and IV (i) are incompatible with 
linear order. With any of the pairs A, B, C, D we can use a third hypothesis on 


(4.1) (1) a, b, x, (2) a, c, x, (3) a, d, x,
      (4) b, c, x, (5) b, d, x, (6) c, d, x.


Every letter which occurs only once in the hypotheses must occur in the con- 
clusion. For, if we had a theorem about linear order for which this were false, 




we would obtain an equivalent one by dropping the hypothesis containing the 
letter; and we have agreed not to consider statements with this property. 
In the six subcases (1)–(6) under A we can use a conclusion on

(4.2)
(1) or (4) a, d, x; b, d, x; c, d, x.
(2) b, d, x.
(3) a, b, x; b, c, x; b, d, x.
(5) a, b, x; a, c, x; a, d, x; b, c, x; c, d, x.
(6) a, b, x; b, c, x; b, d, x.

In the six subcases (1)–(6) under B, C, or D we can use a conclusion on

(4.3)
(1) c, d, x.
(2) or (4) a, d, x; b, d, x; c, d, x.
(3) or (5) a, c, x; b, c, x; c, d, x.
(6) a, b, x; a, c, x; a, d, x; b, c, x; b, d, x.


We proceed then to an examination of cases according to the following plan. With each pair A, B, C, D we inspect the three arrangements of the letters in the six cases of (4.1) to see whether they are consistent with linear order. With each consistent arrangement we inspect each of the three arrangements of letters in the corresponding set of (4.2) and (4.3) to determine whether a theorem in linear order is obtained. The work is shortened by examining each set of three hypotheses as we proceed to see whether it has already occurred under some permutation of the letters; this examination is facilitated by applying the classification scheme I–IV to the three pairs of hypotheses.

This procedure yields the transitivities T1–T38 though not in that order, and the proof is complete.

The reader will observe that the program initiated in the determination of the transitivities τ1, τ2, and T1–T38 could be extended to include transitivities with more hypotheses and more letters. We shall not do this.

5. Selection of fundamental five point transitivities. Each of the transitivities T11–T38 is equivalent to a combination of the transitivities t1, t2, t3. We shall state and prove these facts in the following form

T11. ~. t1·t3 [b=c; c=d. abc, xab (t3) xac, adc (t1) xad.]

We mean that T11 is equivalent to t1 and t3; that t1 is proved by identifying b and c; that t3 is proved by identifying c and d; that T11 is proved from t1 and t3 by applying t3 to abc and xab to obtain xac, and by applying t1 to xac and adc to obtain xad. We use α and β freely without explicit reference. Whenever we use t3 the letters common to the two hypotheses are distinct as required.
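
The derivation of T11 from t1 and t3 described above can be replayed mechanically. The sketch below is a small forward-chaining closure, not the authors' procedure: it assumes the rules t1 (abc·adb → dbc), t2 (abc·adb → adc), and t3 (abc·bcd·b ≠ c → abd) together with Postulate α, applies them only to four distinct symbols, and tracks inequalities in a hypothetical `unequal` set.

```python
from itertools import permutations


def closure(facts, points, unequal, rules):
    """Close a set of betweenness triples under alpha and the chosen rules.

    facts   : set of triples (a, b, c) meaning 'abc'
    unequal : set of frozensets {u, v} of points assumed distinct
    rules   : subset of {"t1", "t2", "t3"}
    """
    facts = set(facts)
    while True:
        new = {(c, b, a) for (a, b, c) in facts}        # alpha: abc -> cba
        for a, b, c, d in permutations(points, 4):
            if (a, b, c) in facts and (a, d, b) in facts:
                if "t1" in rules:
                    new.add((d, b, c))                  # t1: abc.adb -> dbc
                if "t2" in rules:
                    new.add((a, d, c))                  # t2: abc.adb -> adc
            if ("t3" in rules and (a, b, c) in facts and (b, c, d) in facts
                    and frozenset((b, c)) in unequal):
                new.add((a, b, d))                      # t3: abc.bcd.b!=c -> abd
        if new <= facts:                                # nothing new: fixpoint reached
            return facts
        facts |= new


# Hypotheses of T11: abc, adc, xab, with a != b.
hyps = {("a", "b", "c"), ("a", "d", "c"), ("x", "a", "b")}
unequal = {frozenset(("a", "b"))}

derived = closure(hyps, "abcdx", unequal, {"t1", "t3"})
print(("x", "a", "d") in derived)
```

The closure reaches xac by t3 and then xad by t1, mirroring the bracketed derivation for T11.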


(4.2) 


T12. ~. t2·t3 [b=c; a=d. abc, xab (t3) xac, adc (t2) xdc.]
T13. ~. t2·t3 [b=c; a=d. abc, adb (t2) adc; abc, xab (t3) xac, adc (t2) xdc.]
T14. ~. t1 [b=c. abc, xac (t1) xab, adb (t1) xad.]
T15. ~. t1·t2 [a=d; b=c. abc, xac (t1) xab, adb (t2) xdb.]
T16. ~. t2 [a=x. abc, adb (t2) adc, xac (t2) xdc.]
T17. ~. t2 [c=x. abc, adb (t2) adc, acx (t2) adx.]
T18. ~. t1·t2 [b=c; a=d. abc, acx (t2) abx, adb (t1) dbx.]
T19. ~. t3 [b=c. adb, xad (t3) xab, abc (t3) xac.]
T20. ~. t3 [b=d. adb, xad (t3) xab, abc (t3) xbc.]
T21. ~. t2·t3 [a=x; b=c. abc, adb (t2) adc, xad (t3) xdc.]
T22. ~. t1·t2 [b=d; c=x. abc, bxc (t1) abx, adb (t2) adx.]
T23. ~. t1 [a=d. abc, adb (t1) dbc, bxc (t1) dbx.]
T24. ~. t2·t3 [c=x; b=d. abc, bcx (t3) abx, adb (t2) adx.]
T25. ~. t1·t3 [c=x; a=d. abc, bcx (t3) abx, adb (t1) dbx.]
T26. ~. t1·t3 [d=x; a=d. abc, adb (t1) dbc, bdx (t3) xbc.]
T27. ~. t2·t3 [a=d; b=c. abc, adb (t2) adc, dcx (t3) acx, abc (t2) abx.]
T28. ~. t2·t3 [c=x; b=c. abc, adb (t2) adc, xcd (t3) adx.]
T29. ~. t1·t2 [c=x; a=d. abc, adb (t1) dbc, xcd (t2) dbx.]
T30. ~. t1 [a=d. abc, adb (t1) dbc, xcd (t1) bcx.]
T31. ~. t2 [b=c. abc, adb (t2) adc, cxd (t2) axc.]
T32. ~. t1·t2 [b=c; c=x. abc, adb (t2) adc, cxd (t1) adx.]
T33. ~. t1·t2 [d=x; a=d. abc, adb (t1) dbc, cdx (t2) xbc.]
T34. ~. t1 [a=d. abc, adb (t1) dbc, cdx (t1) xdb.]
T35. ~. t3 [b=c. abc, dab (t3) cad, adx (t3) xac.]
T36. ~. t3 [d=x. dab, adx (t3) xab, abc (t3) xbc.]
T37. ~. t2·t3 [b=c; c=x. abc, dab (t3) dac, xcd (t2) dax.]
T38. ~. t2·t3 [a=d; c=x. abc, dab (t3) dbc, xcd (t2) dbx.]


We summarize the results of this section in the following theorem. 


THEOREM 5. Each of the transitivities $T_{11}$-$T_{38}$ is equivalent to a combination 
of the transitivities $t_1$, $t_2$, and $t_3$. 
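
The bracketed proof sketches of this section can be checked mechanically in the simplest model, a linear order. The following is a minimal sketch and not part of the original argument. It assumes the forms of $t_1$, $t_2$, $t_3$ that are consistent with the sketches above (abc, adb give dbc; abc, adb give adc; abc, bcd with b distinct from c give abd), and it assumes that $T_{11}$ reads abc, adc, xab with a distinct from b give xad. The formal statements appear earlier in the paper, and the function names are hypothetical.

```python
from itertools import product

def btw(a, b, c):
    """abc in a linear order: b lies in the closed interval spanned by a and c."""
    return min(a, c) <= b <= max(a, c)

# Each function returns True when the implication holds for one choice of points.
def t1(a, b, c, d):
    return not (btw(a, b, c) and btw(a, d, b)) or btw(d, b, c)

def t2(a, b, c, d):
    return not (btw(a, b, c) and btw(a, d, b)) or btw(a, d, c)

def t3(a, b, c, d):
    return not (btw(a, b, c) and btw(b, c, d) and b != c) or btw(a, b, d)

def T11(a, b, c, d, x):
    # Assumed form: abc, adc, xab, a != b  ->  xad
    hyp = btw(a, b, c) and btw(a, d, c) and btw(x, a, b) and a != b
    return not hyp or btw(x, a, d)

def valid(statement, points, arity):
    """Check the statement for every assignment of the given points."""
    return all(statement(*p) for p in product(points, repeat=arity))

points = range(5)  # five points of a chain, coincidences allowed
results = {
    "t1": valid(t1, points, 4),
    "t2": valid(t2, points, 4),
    "t3": valid(t3, points, 4),
    "T11": valid(T11, points, 5),
}
print(results)
```

A check of this kind only confirms that the statements hold in a linear order; the equivalences of Theorem 5 themselves rest on the derivations listed above.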


6. The logical relations among the fundamental strong transitivities. 
None of the transitivities $T_1$-$T_{10}$ is equivalent to a combination of the 
transitivities $t_1$, $t_2$, $t_3$. We shall devote this section and the following one to a proof 
of this fact. In addition we shall construct the essentials of a complete 
existential theory of $t_1$-$t_3$, $T_1$-$T_{10}$. The basic implications are given in this section. 
We use the notation explained in §5. 


$t_1 \cdot t_3 \to T_1 \to t_2 \cdot t_3$ [abc, adb ($t_1$) dbc, xdb ($t_3$) xdc. a=x; a=d.] 
$t_1 \cdot t_2 \cdot t_3 \to T_2 \to t_3$ [abc, adb ($t_2$) adc; abc, bcx ($t_3$) acx, adc ($t_1$) dcx. a=d.] 
$t_2 \cdot t_3 \to T_3 \to t_3$ [abc, adb ($t_2$) adc, xcd ($t_3$) acx. b=c.] 
$t_1 \cdot t_2 \cdot t_3 \to T_4 \to t_1 \cdot t_2$ [abc, dab ($t_3$) dbc, xcd ($t_2$) dbx, dab ($t_1$) abx. b=c; a=d.] 
$T_5 \to t_2$ [a=b.] 
$t_1 \cdot t_2 \to T_6 \to t_1$ [abc, adb ($t_2$) adc, acx ($t_1$) dcx. b=c.] 
$T_7 \to t_1$ [b=d.] 
$t_1 \cdot t_3 \to T_8 \to t_1$ [abc, dab ($t_3$) dac, xcd ($t_1$) acx. b=c.] 
$t_1 \cdot t_3 \to T_9 \to t_1$ [abc, dab ($t_3$) dbc, xcd ($t_1$) bcx. a=d.] 


It is apparent that when two letters of a statement $T_1$-$T_{10}$ are identified, 
the resulting statement is either a tautology or is equivalent to one of the 
statements $t_1$, $t_2$, $t_3$, $\tau_1$ or $\tau_2$. We may see that we cannot thus obtain either $\tau_1$ 
or $\tau_2$ as follows. Notice that the hypotheses of both $\tau_1$ and $\tau_2$ contain one letter 
three times and three letters twice and that the conclusion of each is on these 
latter three letters. Suppose that an identification of two letters leads to $\tau_1$ 
or $\tau_2$. Then x must be identified with some letter because it occurs only once 
in the hypotheses of each of $T_1$-$T_{10}$. Since x always appears in the conclusions 
of $T_1$-$T_{10}$, the letter to be identified with x can occur only once in the 
hypotheses. It must then also occur in the conclusion along with x. By virtue 
of $\beta$ the conclusion is then either vacuous or implies still further identification. 
This contradicts our assumption that one of the statements $\tau_1$ or $\tau_2$ appears 
on identifying two letters. 

The above list of implications includes all nontautological results obtained by 
identifying two letters. This fact will be useful in simplifying the examination 
of the table of examples to be given in §7. 

In the proofs of the following implications, the results of the preceding 
implications are used. 


$T_8 \to T_6$ [abc, adb ($t_1$) dbc, adb, acx ($T_8$) dcx. 
If d=b then abc, acx ($t_1$) dcx.] 
$T_8 \to T_9$ [abc, dab, xcd ($T_8$) acx, abc ($t_1$) bcx.] 
$T_4 \cdot T_9 \to T_8$ [abc, dab, xcd ($T_4$) abx; abc, dab, xcd, $a \ne b$ ($T_9$) bcx; abx, bcx 
($t_2$) acx.] 
$T_2 \cdot T_9 \to T_8$ [abc, dab, xcd, $a \ne b$ ($T_9$) bcx, dab, xcd ($T_2$) xca.] 
$t_2 \cdot T_8 \to T_4$ [abc, dab, xcd ($T_8$) acx, abc ($t_2$) abx. If a=b, $T_4$ is true.] 
$t_1 \cdot T_7 \to T_6$ [abc, adb ($t_1$) dbc; abc, acx ($t_1$) bcx, acx, adb ($T_7$) dcx.] 
$t_2 \cdot t_3 \cdot T_{10} \to T_1$ [abc, adb ($t_2$) adc, adb, xdb, $a \ne d$, $b \ne d$ ($T_{10}$) xdc. 
If a=d, abc, xab ($t_3$) xdc.] 


We shall devote the next section to the proof of the following theorem. 


THEOREM 6. The implications listed in this section are the only ones holding 
among the statements $t_1$-$t_3$, $T_1$-$T_{10}$. 


REMARK. It seems to be worth mentioning that $t_1 \cdot t_2 \cdot t_3 \to T_1, T_2, T_3, T_4, T_6, 
T_8, T_9$; but that $t_1 \cdot t_2 \cdot t_3$ does not imply $T_5$, $T_7$, or $T_{10}$ (for proof see §7). We are 
of the opinion that the interest of a five point transitivity varies inversely 
as its logical intimacy with $t_1$, $t_2$, and $t_3$. Viewed in this light, $T_{10}$ is surely the 
most interesting, but we still lack a "concrete" interpretation for it. 

7. The examples in the existential theory. We shall complete the existen- 
tial theory begun in §6. We have attempted to make our list of examples as 
simple as possible through the use of composite examples. No attempt has 
been made to make the number of points in each example the least possible 
[14, p. 250]. 

The following elementary examples will be used in the table which 
concludes this section. In each of the examples the positive relations are those 
listed together with the ones which follow from $\alpha$ and $\beta$. In the first four 
examples the class K consists of four distinct points, while in the remaining 
examples K consists of five distinct points. Certain of these examples are 
merely the statement of the hypotheses of one of the transitivities. We 
indicate this by giving the example the same number as the transitivity, replacing 
$t$ by $k$, $\tau$ by $\kappa$ and $T$ by $K$. 


$k_1$. abc adb. 
$k_3$. abc dab. 
$\kappa_1$. abc adb adc. 
$\kappa_2$. abc adb dbc. 
$K_3$. abc adb xcd. 
$K_4$. abc dab xcd. 
$K_5$. abc adc bxd. 
$K_7$. abc abd cxd. 
$K_{10}$. abc abd xbc. 
$E_1$. abc dab xcd abx. 
$E_2$. abc dab xcd bcx. 
$E_3$. abc dab xcd acx bcx. 
$E_4$. abc adb acx dbc bcx. 
$E_5$. abc adb bcx adc acx abx adx. 
$E_6$. abc adb xdb adc. 
$E_7$. abc adb bcx abx acx. 
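
An example of this kind is a finite set of triples, so its behavior under a given transitivity can be tested by enumeration. The sketch below is an illustration only and uses hypothetical function names. It builds the positive relations of the first example (abc and adb on four distinct points, closed under $\alpha$ and $\beta$) and tests $t_1$ and $t_2$ in the assumed forms abc, adb give dbc and abc, adb give adc.

```python
from itertools import product

def positive_relations(points, listed):
    """Listed triples together with those that follow from alpha and beta."""
    rel = set()
    for a, b, c in listed:
        rel.add((a, b, c))
        rel.add((c, b, a))          # alpha: abc implies cba
    for a, b in product(points, repeat=2):
        rel.add((a, b, b))          # beta: abb holds for every pair a, b
        rel.add((b, b, a))          # and so does its reversal
    return rel

def t1_holds(points, rel):
    """abc and adb imply dbc, for every choice of a, b, c, d."""
    return all((d, b, c) in rel
               for a, b, c, d in product(points, repeat=4)
               if (a, b, c) in rel and (a, d, b) in rel)

def t2_holds(points, rel):
    """abc and adb imply adc, for every choice of a, b, c, d."""
    return all((a, d, c) in rel
               for a, b, c, d in product(points, repeat=4)
               if (a, b, c) in rel and (a, d, b) in rel)

K = "abcd"
k1 = positive_relations(K, ["abc", "adb"])
print(t1_holds(K, k1), t2_holds(K, k1))
```

Both tests fail here because the example states the hypotheses of $t_1$ and $t_2$ without either conclusion, which is exactly the role these examples play in the table.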


The following table of examples completes our existential theory. In entry 
4 we take as the space K the points a, b, c, d, x, a', b', c', d', x' with the 
positive relations of example $K_5$ on the points a, b, c, d, x, the positive relations of 
example $K_7$ on the points a', b', c', d', x', and the other positive relations 
required by $\beta$. Each case in which the example column contains more than one 
entry is to be treated similarly. We have made no column for $T_{10}$. It will be 
found to hold in each of our examples except 32-35, where it must fail because 
of the implication $t_2 \cdot t_3 \cdot T_{10} \to T_1$. To secure the example corresponding to those 
listed in which $T_{10}$ fails we simply adjoin $K_{10}$ to the example listed. 




th ty ts Ti T3 Ts Ts Tx Ts To Example 
1. + + + + + + + + + + Linear order 
2. + - 
3. - + $K_5$ 
4. $K_5$, $K_7$ 
5. + -+ + + $k_3$ 
6. + + + - - $E_1$ 
7. + + - + + $k_3$, $K_7$ 
8. + + - - - $E_1$, $K_7$ 
9. - + + - - $K_4$ 
10. - + - --- + $E_2$ 
11. - + - - - $K_4$, $K_7$ 
12-18. * as in 5-11, with $K_5$ 
19. - + + + + $E_3$ 
20. + + - = $K_4$, $\kappa_2$ 
21. + - + + $E_3$, $K_7$ 
22. + - - + $E_2$, $\kappa_2$ 
23. + - - = $K_4$, $K_7$, $\kappa_2$ 
24. - - - + $E_4$ 
25. $E_4$, $K_4$ 
26. - + + + + + - + $\kappa_1$ 
27. + + - $\kappa_1$, $K_5$ 
29. + - - $E_5$, $K_5$ 
30. - + + $E_6$ 
31. - + - $E_6$, $K_5$ 
32. + $E_5$, $E_6$ 
33. - - - $E_5$, $E_6$, $K_5$ 
34. - + - - - = $\kappa_1$, $k_3$ 
35. $\kappa_1$, $k_3$, $K_5$ 
37. + - $K_3$ 
38. - + $E_7$ 
39. - - $E_1$, $K_3$ 


* In these places arrange + and - signs as in the entries 5-11. 


Part II 


We shall devote the remainder of this paper to the study of a generaliza- 
tion of metric betweenness in metric lattices, and to the application of the 
transitivities of Part I both to this relation and to the relation of betweenness 
in semi metric, metric, and metric ptolemaic spaces. 




8. Lattice betweenness. Glivenko [5, 6] proved that in a metric lattice 
an element b is (metrically) between the elements a and c if and only if 

(8.1) $(a \cap b) \cup (b \cap c) = b = (a \cup b) \cap (b \cup c)$. 

This condition does not involve the metric and we take it as our definition(4) 
of betweenness in an arbitrary lattice L. When b is between a and c we shall 
frequently write simply abc. We shall need the simple and fundamental 
properties of this relation given in the following two lemmas(5). 


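LEMMA 8.1. If L is a lattice and $a, b, c \in L$, then 

(1) the inequalities $a \le b \le c$ imply that the relation abc holds; 
(2) the relation abc implies that $a \cap c \le b \le a \cup c$; 
(3) both $a \cap c$ and $a \cup c$ are between a and c. 


Proof. (1) If $a \le b \le c$, then $(a \cap b) \cup (b \cap c) = a \cup b = b = b \cap c = (a \cup b) \cap (b \cup c)$. 
By our definition, abc; and (1) is proved. 
(2) If abc, then $b \cap (a \cap c) = (a \cup b) \cap (b \cup c) \cap (a \cap c) = a \cap c$. It follows that 
$a \cap c \le b$. Dually, $b \le a \cup c$. This proves (2). 
(3) Note that $(a \cap (a \cup c)) \cup ((a \cup c) \cap c) = a \cup c$, and also that 
$(a \cup (a \cup c)) \cap ((a \cup c) \cup c) = a \cup c$. By definition, $a \cup c$ is between a and c. Dually, $a \cap c$ is 
between a and c. This proves (3). 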


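LEMMA 8.2. If L is a lattice then its betweenness relation satisfies $\alpha$ and $\beta$. 


Proof. ($\alpha$) This is an immediate consequence of the commutativity of the 
operations $a \cap b$ and $a \cup b$ in lattices. 
($\beta$) Let L be a lattice containing elements a, b, c for which the relations abc 
and acb hold. We then have $b = (a \cup b) \cap (b \cup c)$ and $c = (a \cup c) \cap (c \cup b)$, and 

$$b \cap c = (a \cup b) \cap (b \cup c) \cap (a \cup c) \ge b \cup c,$$

the inequality holding because $b \le a \cup c$ and $c \le a \cup b$ by Lemma 8.1 (2). 
It follows that $b \cap c = b \cup c$, whence $b = c$. To prove the converse we must show that aac is 
valid in lattices for every pair of elements a, c. It is easily seen that 
$(a \cap a) \cup (a \cap c) = a \cup (a \cap c) = a$. Using duality we see by the definition that the 
relation aac holds. The proof is complete. 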
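
Definition (8.1) and the two lemmas are easy to test on a concrete lattice. The following minimal sketch uses the divisors of 30 under divisibility, with greatest common divisor as meet and least common multiple as join; this lattice is a Boolean algebra of eight elements. The choice of lattice and the function names are assumptions made for illustration.

```python
from itertools import product
from math import gcd

N = 30
L = [d for d in range(1, N + 1) if N % d == 0]   # 1, 2, 3, 5, 6, 10, 15, 30

def meet(x, y):
    return gcd(x, y)

def join(x, y):
    return x * y // gcd(x, y)

def leq(x, y):
    """x <= y in the divisibility order."""
    return y % x == 0

def between(a, b, c):
    """Definition (8.1): (a meet b) join (b meet c) = b = (a join b) meet (b join c)."""
    return join(meet(a, b), meet(b, c)) == b == meet(join(a, b), join(b, c))

triples = list(product(L, repeat=3))
checks = {
    "Lemma 8.1 (1)": all(between(a, b, c) for a, b, c in triples
                         if leq(a, b) and leq(b, c)),
    "Lemma 8.1 (2)": all(leq(meet(a, c), b) and leq(b, join(a, c))
                         for a, b, c in triples if between(a, b, c)),
    "Lemma 8.1 (3)": all(between(a, meet(a, c), c) and between(a, join(a, c), c)
                         for a, c in product(L, repeat=2)),
    "alpha": all(between(c, b, a) for a, b, c in triples if between(a, b, c)),
    "beta": all((between(a, b, c) and between(a, c, b)) == (b == c)
                for a, b, c in triples),
}
print(checks)
```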
In addition to these fundamental properties, we now show that lattice 
betweenness possesses the five point transitivity $T_6$. 


THEOREM 8.1. If L is a lattice then its betweenness relation satisfies the 
transitivity $T_6$. 

(4) G. Birkhoff [1, p. 9] also gives a definition of betweenness which applies to partially 
ordered sets and which has all the transitivities of Part I. 


(5) We may also note that abc holds if and only if $a \cap c \le b \le a \cup c$ and (a, b, c)D (see J. von 
Neumann, Continuous Geometries, Princeton Lecture Notes, 1936-1937; and Lemma 10.1 below). 




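Proof. Let L be a lattice and consider elements $a, b, c, d, x \in L$ for which 
the relations abc, adb, and acx hold. We wish to show that dcx is true. We 
prove first that $(d \cap c) \cup (c \cap x) = c$. Notice that $c = (a \cap c) \cup (c \cap x)$; and, by 
Lemma 8.1 (2), that $a \cap c \le b$, and $a \cap b \le d$. It follows that $a \cap c \le a \cap b \le d$. We 
obtain 

$$(d \cap c) \cup (c \cap x) = (d \cap ((a \cap c) \cup (c \cap x))) \cup (c \cap x) \ge (d \cap a \cap c) \cup (c \cap x) \ge (a \cap c) \cup (c \cap x) = c.$$

Since also $(d \cap c) \cup (c \cap x) \le c$, we have $(d \cap c) \cup (c \cap x) = c$. Dually, $(d \cup c) \cap (c \cup x) = c$. By definition, 
we have dcx. The proof is complete. 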
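
Since Theorem 8.1 makes no assumption on the lattice, it can be tested on lattices that are not distributive or not even modular. The sketch below is an illustration with hypothetical names and labels. It builds the five-element non-modular lattice (pentagon) and the five-element modular non-distributive lattice (diamond) from their covering pairs and checks the transitivity in the form abc, adb, acx give dcx.

```python
from itertools import product

def lattice_ops(elems, covers):
    """Derive meet and join of a finite lattice from its covering pairs (lower, upper)."""
    below = {(x, x) for x in elems} | set(covers)
    for k, i, j in product(elems, repeat=3):      # transitive closure, k outermost
        if (i, k) in below and (k, j) in below:
            below.add((i, j))
    def meet(x, y):
        lower = [z for z in elems if (z, x) in below and (z, y) in below]
        return next(z for z in lower if all((w, z) in below for w in lower))
    def join(x, y):
        upper = [z for z in elems if (x, z) in below and (y, z) in below]
        return next(z for z in upper if all((z, w) in below for w in upper))
    return meet, join

def satisfies_T6(elems, covers):
    meet, join = lattice_ops(elems, covers)
    def btw(p, q, r):                             # definition (8.1)
        return join(meet(p, q), meet(q, r)) == q == meet(join(p, q), join(q, r))
    return all(btw(d, c, x)
               for a, b, c, d, x in product(elems, repeat=5)
               if btw(a, b, c) and btw(a, d, b) and btw(a, c, x))

# Pentagon: 0 < u < v < 1 and 0 < w < 1.  Diamond: 0 < p, q, r < 1.
N5 = ("0uvw1", [("0", "u"), ("u", "v"), ("v", "1"), ("0", "w"), ("w", "1")])
M3 = ("0pqr1", [("0", "p"), ("0", "q"), ("0", "r"),
                ("p", "1"), ("q", "1"), ("r", "1")])
print(satisfies_T6(*N5), satisfies_T6(*M3))
```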


COROLLARY. The transitivities $t_1$ and $\tau_1$ are valid for the betweenness relation 
in every lattice. 


Proof. This is a trivial result of the implications $T_6 \to t_1 \to \tau_1$. 

9. Interpretations of certain of the five point transitivities for lattice 
betweenness. Glivenko [5] showed that a metric lattice is distributive if and 
only if its (metric) betweenness relation has the transitivity which we have 
labeled $T_5$. We shall extend this result to lattice betweenness in this section. 
We shall also prove that both $T_4$ and $T_7 \cdot t_2$ are equivalent to the distributive 
law; that each of the transitivities $t_2$ and $\tau_2$ is equivalent to the modular law; 
and that each one of the postulates $t_3$, $T_1$, $T_2$, and $T_3$ holds if and only if the 
lattice is linearly ordered. The remaining transitivities do not seem to have 
important lattice-theoretic interpretations. We shall verify that each of them 
fails in the Boolean algebra of eight elements. 

Our first theorem gives the interpretation of the transitivity $t_2$. 


THEOREM 9.1. A lattice L is modular if and only if its betweenness relation 
satisfies the transitivity t2. 


Proof. Consider first a modular lattice L containing elements a, b, c, d for 
which the relations abc and adb hold. We wish to establish the relation adc. 
Note that $(a \cap d) \cup (d \cap b) = d$ and $(a \cap b) \cup (b \cap c) = b$, and, by Lemma 8.1 (2), that 
$a \cap c \le b$, and $a \cap b \le d$. We then obtain $d = (a \cap d) \cup (d \cap ((a \cap b) \cup (b \cap c)))$. 
Using the modular law, since $a \cap b \le d$, we find that 

$$d = (a \cap d) \cup (a \cap b) \cup (d \cap b \cap c) = (a \cap d) \cup (d \cap b \cap c) \le (a \cap d) \cup (d \cap c) \le d.$$

Hence $d = (a \cap d) \cup (d \cap c)$. Dually, $d = (a \cup d) \cap (d \cup c)$. Consequently, the 
relation adc is valid. Thus the modular law implies the transitivity $t_2$. Conversely, 
the transitivity $t_2$ implies the modular law. To see this, let L be a lattice 
whose betweenness satisfies $t_2$. If L is non-modular it must contain the 
simplest non-modular lattice of five elements shown in Figure 9.1 as a sublattice. 

FIG. 9.1 

Note that, by Lemma 8.1 (3), the relation abc holds in L since $b = a \cup c$; and 
that the relation adb holds in L by Lemma 8.1 (1). But if the relation adc 
is true, then $(a \cap d) \cup (d \cap c) = d$. However, we see from Figure 9.1 that 
$(a \cap d) \cup (d \cap c) = a \ne d$. Thus the transitivity $t_2$ fails in L. This is contrary to 
our hypothesis. It follows that the lattice L is modular. The proof is complete. 
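
The counterexample in this proof can be recomputed directly. The sketch below is an illustration only. It assumes that Figure 9.1 is the pentagon with a < d < b, with c incomparable to a and d, and with a least element that is given the hypothetical name o here; these are the relations the proof reads off from the figure.

```python
from itertools import product

def lattice_ops(elems, covers):
    """Derive meet and join of a finite lattice from its covering pairs (lower, upper)."""
    below = {(x, x) for x in elems} | set(covers)
    for k, i, j in product(elems, repeat=3):      # transitive closure, k outermost
        if (i, k) in below and (k, j) in below:
            below.add((i, j))
    def meet(x, y):
        lower = [z for z in elems if (z, x) in below and (z, y) in below]
        return next(z for z in lower if all((w, z) in below for w in lower))
    def join(x, y):
        upper = [z for z in elems if (x, z) in below and (y, z) in below]
        return next(z for z in upper if all((z, w) in below for w in upper))
    return meet, join

# Assumed labelling of Figure 9.1: o < a < d < b and o < c < b.
elems = ["o", "a", "d", "b", "c"]
covers = [("o", "a"), ("a", "d"), ("d", "b"), ("o", "c"), ("c", "b")]
meet, join = lattice_ops(elems, covers)

def btw(p, q, r):
    """Definition (8.1)."""
    return join(meet(p, q), meet(q, r)) == q == meet(join(p, q), join(q, r))

hypotheses = btw("a", "b", "c") and btw("a", "d", "b")     # abc and adb
conclusion = btw("a", "d", "c")                            # adc
left_side = join(meet("a", "d"), meet("d", "c"))           # (a meet d) join (d meet c)
print(hypotheses, conclusion, left_side)
```

Under this labelling the hypotheses of $t_2$ hold, the conclusion fails, and the computed element is a rather than d, in agreement with the proof.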
A similar result holds for the transitivity $\tau_2$. 


THEOREM 9.2. If L is a lattice, then its betweenness relation satisfies the 
transitivity $\tau_2$ if and only if L is modular. 


Proof. If L is a modular lattice, it is clear from the implication $t_2 \to \tau_2$ and 
Theorem 9.1 that $\tau_2$ is valid for the betweenness of L. On the other hand, if $\tau_2$ 
holds then the lattice must be modular. Otherwise a sublattice such as we 
have pictured in Figure 9.1 exists. In it we have shown, in the proof of 
Theorem 9.1, that the relations abc and adb are true and that the relation adc is 
false. But the relation dbc also holds in the lattice of Figure 9.1, since $b = d \cup c$. 
Hence the hypotheses of the transitivity $\tau_2$ hold in this sublattice (and therefore 
also in the lattice itself), but its conclusion fails. This is contrary to the 
assumption that the transitivity $\tau_2$ holds. Thus the transitivity $\tau_2$ implies that 
the modular law is valid. The proof is complete. 

We pass now to a discussion of the transitivity $T_5$. Our next lemma, on 
the road to establishing the equivalence of $T_5$ and the distributive law, gives 
a relation between Duthie's segments(6) and our betweenness. 


LEMMA 9.1. If L is a lattice then it is distributive if and only if for every 
triple $a, b, c \in L$ the inequalities $a \cap c \le b \le a \cup c$ imply that the relation abc holds. 


Proof. Consider a lattice L in which the implication of our lemma holds. 
We establish the modular law for L first. Consider three elements $a, b, c \in L$ 
with $a \le c$. Since $a \cap b \le c \cap (a \cup b) \le a \cup b$ our hypothesis yields that $c \cap (a \cup b)$ 
is between a and b. Whence we have 

$$c \cap (a \cup b) = (a \cap (c \cap (a \cup b))) \cup ((c \cap (a \cup b)) \cap b) = (a \cap c) \cup (c \cap b) = a \cup (b \cap c),$$

which is the modular law. Now consider elements $u, v, w \in L$. Note that 
$(u \cap w) \le (u \cap w) \cup (v \cap (u \cup w)) \le (u \cup w)$. By hypothesis we obtain that 
$z = (u \cap w) \cup (v \cap (u \cup w))$ is between u and w. An easy application of the 
modular law reduces the conditions that this relation hold to the equations 

$$(u \cap v) \cup (v \cap w) \cup (w \cap u) = z = (u \cup v) \cap (v \cup w) \cap (w \cup u).$$

But this last identity characterizes distributive lattices [1, p. 74]. Conversely, 
if L is distributive and a, b, c are three elements of L for which $a \cap c \le b \le a \cup c$, 
then $(a \cap b) \cup (b \cap c) = b \cap (a \cup c) = b$, and dually. Thus the relation abc holds. 
This completes the proof. 

(6) Duthie defines a segment of a lattice L between two elements $a, b \in L$ as the set of all 
$x \in L$ satisfying $a \cap b \le x \le a \cup b$. Our lemma has also been proved by him [4]. 
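
The criterion of Lemma 9.1 lends itself to a brute-force test on small lattices. The sketch below is an illustration with hypothetical names. It checks whether the inequalities $a \cap c \le b \le a \cup c$ always give abc, first in the four-element Boolean algebra, which is distributive, and then in the diamond and the pentagon, which are not.

```python
from itertools import product

def lattice_ops(elems, covers):
    """Derive leq, meet and join of a finite lattice from its covering pairs."""
    below = {(x, x) for x in elems} | set(covers)
    for k, i, j in product(elems, repeat=3):      # transitive closure, k outermost
        if (i, k) in below and (k, j) in below:
            below.add((i, j))
    def leq(x, y):
        return (x, y) in below
    def meet(x, y):
        lower = [z for z in elems if leq(z, x) and leq(z, y)]
        return next(z for z in lower if all(leq(w, z) for w in lower))
    def join(x, y):
        upper = [z for z in elems if leq(x, z) and leq(y, z)]
        return next(z for z in upper if all(leq(z, w) for w in upper))
    return leq, meet, join

def interval_implies_between(elems, covers):
    """True when a meet c <= b <= a join c always implies abc."""
    leq, meet, join = lattice_ops(elems, covers)
    def btw(p, q, r):                             # definition (8.1)
        return join(meet(p, q), meet(q, r)) == q == meet(join(p, q), join(q, r))
    return all(btw(a, b, c)
               for a, b, c in product(elems, repeat=3)
               if leq(meet(a, c), b) and leq(b, join(a, c)))

B4 = ("0pq1", [("0", "p"), ("0", "q"), ("p", "1"), ("q", "1")])
M3 = ("0pqr1", [("0", "p"), ("0", "q"), ("0", "r"),
                ("p", "1"), ("q", "1"), ("r", "1")])
N5 = ("0uvw1", [("0", "u"), ("u", "v"), ("v", "1"), ("0", "w"), ("w", "1")])
print(interval_implies_between(*B4),
      interval_implies_between(*M3),
      interval_implies_between(*N5))
```

In the diamond the failure is visible by hand: if a, b, c are the three atoms, then b lies between the meet and the join of a and c in the order, yet $(a \cap b) \cup (b \cap c)$ is the least element.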

We continue with the proof that $T_5$ is equivalent to the distributive law. 


THEOREM 9.3. If L is a lattice, then its betweenness relation has the 
transitivity $T_5$ if and only if L is distributive. 


Proof. Consider a lattice L whose betweenness relation satisfies $T_5$. By 
Lemma 9.1, L will be distributive provided the relation abc holds for every 
triple $a, b, c \in L$ such that $a \cap c \le b \le a \cup c$. Hence consider elements $a, b, c \in L$ 
for which $a \cap c \le b \le a \cup c$. By Lemma 8.1 (1), b is between $a \cap c$ and $a \cup c$. By 
Lemma 8.1 (3), we know that both $a \cap c$ and $a \cup c$ are between a and c. 
Application of the transitivity $T_5$ then yields the fact that b is between a and c. 
Thus the validity of the transitivity $T_5$ implies the distributive law by Lemma 
9.1. Conversely, if L is distributive and the relations abc, adc, and bxd hold 
for elements $a, b, c, d, x \in L$, then, using Lemma 8.1 (2), we obtain 

$$a \cap c \le b \cap d \le x \le b \cup d \le a \cup c.$$

Since L is distributive, it then follows from Lemma 9.1 that x is between a 
and c. Hence the distributive law implies that the transitivity $T_5$ holds in L. 
This completes the proof. 

Still another form of the distributive law is provided by the postulate $T_4$, 
while $T_7$ is equivalent to the distributive law in modular lattices. The next 
three theorems will show this. 


THEOREM 9.4. If L is a distributive lattice, then its betweenness relation has 
the transitivities $T_4$ and $T_7$. 


Proof. Let L be a distributive lattice. We prove first that $T_4$ holds for the 
betweenness of L. Consider five elements $a, b, c, d, x \in L$ for which the 
relations abc, dab, and xcd hold. We wish to prove that the relation abx is valid. 
By Lemma 9.1 it is sufficient to show that $a \cap x \le b \le a \cup x$. Lemma 8.1 (2) 
yields that $a \le b \cup d$, and that $x \cap d \le c$. Hence we find that 

$$x \cap (b \cup d) = (x \cap b) \cup (x \cap d) \le (x \cap b) \cup c.$$

Combining with a we have 

$$a \cap x = a \cap x \cap (b \cup d) \le a \cap ((x \cap b) \cup c) = (a \cap x \cap b) \cup (a \cap c) \le b \cup (a \cap c).$$

But $a \cap c \le b$ since the relation abc holds. It follows that $a \cap x \le b$. 
Dually, $a \cup x \ge b$. By Lemma 9.1 and the fact that L is distributive we then 
know that the relation abx is valid. Thus the transitivity $T_4$ is valid in 
distributive lattices. 

To prove that $T_7$ holds for the betweenness relation of a distributive 
lattice L, consider five elements $a, b, c, d, x \in L$ for which the relations abc, abd, 
and cxd hold. We wish to show that we have the relation abx. By Lemma 8.1 
(2), we know that $a \cap c \le b$, $a \cap d \le b$, and that $x \le c \cup d$. Combining the last 
inequality with a, we find that $a \cap x \le a \cap (c \cup d) = (a \cap c) \cup (a \cap d) \le b$. Dually, 
$a \cup x \ge b$. It follows from Lemma 9.1 that the relation abx is true. Thus $T_7$ 
is valid in distributive lattices. The proof is complete. 


THEOREM 9.5. If L is a lattice whose betweenness relation has the 
transitivity $T_4$, then L is distributive. 


Proof. The implication $T_4 \to t_2$, proved in §6, together with the result of 
Theorem 9.1 shows that if $T_4$ holds for lattice betweenness in a lattice L 
then L is modular. It is well known [1, p. 75] that every modular 
non-distributive lattice contains a copy of the simplest modular non-distributive 
lattice shown in Figure 9.2 as a sublattice. 

FIG. 9.2 

Thus if $T_4$ were to hold for lattice 
betweenness in a non-distributive lattice L it would hold in the lattice of 
Figure 9.2. In this lattice we have the relations abc, dab, and xcd since a<b<c, 
$a = b \cap d$, and $c = d \cup x$. But abx would require that $(a \cap b) \cup (b \cap x) = b$, while 
actually $(a \cap b) \cup (b \cap x) = a$. Thus $T_4$ fails in this lattice. It follows 
that the transitivity $T_4$ for lattice betweenness implies that the lattice is 
distributive. The proof is complete. 
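
The failure of $T_4$ in the lattice of Figure 9.2 can be recomputed. The sketch below is an illustration only. It assumes the labelling that the relations quoted in the proof force on the five-element modular non-distributive lattice: a is the least element, c the greatest, and b, d, x are the three pairwise incomparable elements between them.

```python
from itertools import product

def lattice_ops(elems, covers):
    """Derive meet and join of a finite lattice from its covering pairs (lower, upper)."""
    below = {(x, x) for x in elems} | set(covers)
    for k, i, j in product(elems, repeat=3):      # transitive closure, k outermost
        if (i, k) in below and (k, j) in below:
            below.add((i, j))
    def meet(x, y):
        lower = [z for z in elems if (z, x) in below and (z, y) in below]
        return next(z for z in lower if all((w, z) in below for w in lower))
    def join(x, y):
        upper = [z for z in elems if (x, z) in below and (y, z) in below]
        return next(z for z in upper if all((z, w) in below for w in upper))
    return meet, join

# Assumed labelling of Figure 9.2: a < b, d, x < c.
elems = ["a", "b", "d", "x", "c"]
covers = [("a", "b"), ("a", "d"), ("a", "x"), ("b", "c"), ("d", "c"), ("x", "c")]
meet, join = lattice_ops(elems, covers)

def btw(p, q, r):
    """Definition (8.1)."""
    return join(meet(p, q), meet(q, r)) == q == meet(join(p, q), join(q, r))

hypotheses = btw("a", "b", "c") and btw("d", "a", "b") and btw("x", "c", "d")
conclusion = btw("a", "b", "x")                            # abx, the conclusion of T4
left_side = join(meet("a", "b"), meet("b", "x"))           # (a meet b) join (b meet x)
print(hypotheses, conclusion, left_side)
```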


THEOREM 9.6. If L is a modular lattice whose betweenness relation satisfies 
the transitivity $T_7$ then L is distributive. 


Proof. Since L is modular it can fail to be distributive only if it has a 
sublattice of the type shown in Figure 9.2. If we reletter the elements of this 
lattice, putting a'=b, b'=a, c'=x, d'=d, and x'=c, we may verify easily that 
the relations a'b'c', a'b'd', and c'x'd' hold since $b' = a' \cap c'$, $b' = a' \cap d'$, and 
$x' = c' \cup d'$. If $T_7$ held we should have $(a' \cup b') \cap (b' \cup x') = b'$, while in fact 

$$(a' \cup b') \cap (b' \cup x') = a'.$$

Thus $T_7$ fails for L. It follows that a modular lattice L cannot fail to be 
distributive when $T_7$ holds. The proof is complete. 

REMARK. An examination of the lattice of Figure 9.1 will show that the 
result of Theorem 9.6 cannot be extended to non-modular lattices. 

Our next theorem discusses the transitivities $T_1$, $T_2$, $T_3$, and $t_3$. 


THEOREM 9.7. If L is a lattice then its betweenness relation has one of the 
transitivities $T_1$, $T_2$, $T_3$, $t_3$ if and only if L is linearly ordered. 


Proof. Since the transitivities cited obviously hold in a linear order and 
since each of them implies $t_3$, it will suffice to show that the betweenness 
relation of a lattice satisfies $t_3$ only if the lattice is a linear order. Hence let L be 
a lattice whose betweenness relation satisfies $t_3$. Consider two elements $a, b \in L$. 
Suppose that none of the relations a<b, a>b, a=b holds. Then clearly 
$a \cap b \ne a$ and $a \ne a \cup b$. Note that $a \cap b < a < a \cup b$. By Lemma 8.1 (1), we find 
that a is between $a \cup b$ and $a \cap b$. By Lemma 8.1 (3), $a \cup b$ is between a and b. 
The transitivity $t_3$ then yields the fact that a is between $a \cap b$ and b. It 
follows that 

$$a = ((a \cap b) \cap a) \cup (a \cap b) = a \cap b,$$

contrary to the fact that $a \cap b \ne a$. Thus, if $t_3$ holds for the betweenness of L 
no pair of elements of L can be incomparable. This means that L is linearly 
ordered. The proof is complete. 
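
The argument of this proof can be traced on two small divisor lattices. The sketch below is an illustration only, with a hypothetical function name. It takes $t_3$ in the assumed form abc, bcd with b distinct from c give abd, and compares the divisors of 8, which form a chain, with the divisors of 6, where 2 and 3 are incomparable.

```python
from itertools import product
from math import gcd

def satisfies_t3(n):
    """Test t3 for lattice betweenness on the divisors of n (meet = gcd, join = lcm)."""
    L = [d for d in range(1, n + 1) if n % d == 0]
    def lcm(x, y):
        return x * y // gcd(x, y)
    def btw(a, b, c):                             # definition (8.1)
        return lcm(gcd(a, b), gcd(b, c)) == b == gcd(lcm(a, b), lcm(b, c))
    return all(btw(a, b, d)
               for a, b, c, d in product(L, repeat=4)
               if btw(a, b, c) and btw(b, c, d) and b != c)

print(satisfies_t3(8), satisfies_t3(6))
```

For the divisors of 6 the failing instance is the one constructed in the proof with incomparable elements 2 and 3: the element 2 is between 1 and 6, and 6 is between 2 and 3, but 2 is not between 1 and 3.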


FIG. 9.3 FIG. 9.4 


We now show that each of the remaining transitivities, namely, $T_8$, $T_9$, 
and $T_{10}$, fails to hold in the Boolean algebra of eight elements. To see that 
$T_{10}$ fails note that in Figure 9.3 we have abc, abd, xbc, and bxd since a<b<c, 
$a \cap d < b < a \cup d$, c<b<x, and $x = b \cup d$; but if xbd also held then $\beta$ would 
require that x=b. Figure 9.4 provides a counterexample for $T_8$ and $T_9$. Using 
Lemma 8.1, we see that we have abc, dab, and xcd since a<b<c, $a = d \cap b$, 
and $c = d \cup x$. But bcx is false since $b \cup x = x$, which does not contain c; and acx 
is false since $a \cup x = x$. 

Let us summarize the results of this section and the preceding one in a 
theorem. 


THEOREM 9.8. If L is a lattice then its betweenness relation has the 
transitivities $T_6$, $t_1$, and $\tau_1$; it has each of the transitivities $t_2$ and $\tau_2$ if and only if L is 
modular; it has each of the transitivities $T_4$ and $T_5$ if and only if L is distributive; it 
has the transitivity $T_7$ if and only if L is distributive provided that L is modular; 
and it has each of the transitivities $t_3$, $T_1$, $T_2$, and $T_3$ if and only if L is linearly 
ordered. 


10. Critique of lattice betweenness. A. Wald found a set of properties of 
metric betweenness which characterize this relation in metric spaces [13]. We 
shall devote this section to a proof of an analogous result for lattice between- 
ness. The algebraic structure of lattices permits a slight economy in that we 
may characterize our relation of lattice betweenness in the particular lattice 
considered, while Wald found it necessary to consider a relation R defined in 
every metric space. 

The present form of this section is due in large measure to suggestions of 
W. R. Transue. He, together with one of us, applies the result in a study of 
transitivities of betweenness in metric lattices and their generalizations. 

Our result takes the following form. 


THEOREM 10.1. If L is a lattice and R is a triadic relation defined for all 
ordered(7) triples of elements of L, then R is lattice betweenness provided that the 
following conditions hold. 

(i) R satisfies the postulates ($\alpha$) and ($\beta$). 

(ii) R satisfies the transitivity $t_1$. 

(iii) If $a \le b \le c$, then (a, b, c)R. 

(iv) The relations $(a, a \cup c, c)R$, $(a, a \cap c, c)R$ hold for every $a, c \in L$. 

(v) If the relation abc holds, then in the sublattice generated by a, b, c the 
transitivity $t_2$ holds for R. 


The properties (i)-(iv) have already been established for lattice 
betweenness in Lemmas 8.1 and 8.2 and in the corollary to Theorem 8.1. The 
following lemma justifies the assumption (v). 


LEMMA 10.1. If L is a lattice and the relation abc holds for three elements 
$a, b, c \in L$, then the sublattice generated by a, b, c is distributive. 


Proof. We shall prove, in fact, that the free lattice [1, p. 22] generated 
by three elements a, b, c for which the relation abc is assumed is given in 
Figure 10.1. To prove this, let a, b, c be three elements which generate a 
lattice in which the relation abc holds. 

FIG. 10.1 

We note first that we must then have 
$b = (a \cup b) \cap (b \cup c)$. Hence we must also have 

$$((a \cup b) \cap c) \cup b = ((a \cup b) \cap c) \cup ((a \cup b) \cap (b \cup c)) = (a \cup b) \cap (b \cup c) = b.$$

It follows that $b \ge (a \cup b) \cap c$. Consequently, $b \cap c \ge (a \cup b) \cap c \ge b \cap c$, and 
hence $b \cap c = (a \cup b) \cap c$. Using this result we see that $(b \cap c) \cup (a \cap c) = 
((a \cup b) \cap c) \cup (a \cap c) = c \cap (a \cup b) = b \cap c$. Interchange of a and c in these 
results and their duals justifies Figure 10.1. It is obvious that the lattice of 
Figure 10.1 is distributive since it is the product [1, p. 13 and p. 76] of two 
chains of three elements. The proof is complete. 

(7) This word "ordered" refers, of course, to the fundamental metamathematical notion of 
"ordered" set. This should not be confused with the order relation in the lattice L. 

Proof of Theorem 10.1. Consider a lattice L and a triadic relation R 
defined for all ordered triples of elements of L which satisfies the conditions 
(i)-(v) of Theorem 10.1. We prove first that the relation (a, b, c)R implies 
the relation abc. For this implication we need only the conditions (i)-(iv). 
Consider three elements $a, b, c \in L$ for which the relation (a, b, c)R holds. By 
(iv) we have $(a, a \cup b, b)R$ and (ii) then gives $(a \cup b, b, c)R$. Again (iv) gives 
$(b, b \cup c, c)R$ and (ii) yields $(a \cup b, b, b \cup c)R$. Note that $b \le (a \cup b) \cap (b \cup c) \le a \cup b$ 
and apply (iii) to obtain $(b, (a \cup b) \cap (b \cup c), a \cup b)R$. Combining 
this last relation with $(a \cup b, b, b \cup c)R$ and using (ii) we find that 
$(b \cup c, b, (a \cup b) \cap (b \cup c))R$. But since $b \le (a \cup b) \cap (b \cup c) \le b \cup c$, (iii) gives 
$(b, (a \cup b) \cap (b \cup c), b \cup c)R$. Using (i) we then obtain $b = (a \cup b) \cap (b \cup c)$. By 
duality, $b = (a \cap b) \cup (b \cap c)$, and we conclude that the relation abc holds. Thus 
the relation (a, b, c)R implies the relation abc. 

We prove next that the relation abc implies the relation (a, b, c)R. For 
this implication we do not use the condition (ii). In the proof we shall omit 
explicit reference to our use of the condition (i). Let a, b, c be three elements of L 
for which the relation abc holds. By Lemma 8.1 (2) we have $a \le a \cup b \le a \cup b \cup c 
= a \cup c$, and the relations $(a, a \cup b \cup c, c)R$ and $(a, a \cup b, a \cup b \cup c)R$ then 
follow from (iv) and (iii). Condition (v) then gives $(a, a \cup b, c)R$. Note that 
$c \le b \cup c \le a \cup b \cup c$. The relations $(c, b \cup c, a \cup b \cup c)R$ and $(c, a \cup b \cup c, a \cup b)R$ 
then follow from (iii) and (iv). Applying (v) we obtain $(c, b \cup c, a \cup b)R$. Since 
abc holds we have $b = (a \cup b) \cap (b \cup c)$, and (iv) then gives $(b \cup c, b, a \cup b)R$. 
Condition (v) then yields $(c, b, a \cup b)R$. Combining this last relation with 
$(c, a \cup b, a)R$ and using (v) again we find (a, b, c)R. Thus the relation abc 
implies the relation (a, b, c)R. 

Combining the results of the preceding two paragraphs we find that the 
relation R holds if and only if lattice betweenness holds, that is, R is the 
lattice betweenness of L. The proof of Theorem 10.1 is complete. 

REMARK. It seems unfortunate that our theorem requires the condition 
(v). That it is necessary to make some such assumption may be seen by con- 
sidering the lattice of Figure 10.1. In this lattice let R be the same as lattice 
betweenness except that the relation (a, b, c)R does not hold. If we could 
prove (a, b, c)R from the assumptions (i)—(iv) of Theorem 10.1, then we 
should have to obtain this result from condition (ii) since the conclusions of 
(i), (ii), and (iii) cannot apply to a triple (d, e, f) with both d and e and f and e 
not comparable. To obtain (a, b, c)R from (ii) would require hypotheses of 
the form (d, b, c)R, (d, a, b)R or of the form (d, b, a)R, (d, c, b)R. But these 
sets cannot hold in our example, since if we have (d, a, b)R and (d, b, c)R, 
then da, and It follows that dU) =w or u and hence that 
d(\b=b, contrary to d(\bsSa. The other set of hypotheses may be treated 
likewise by interchanging a and c. It is possible to give alternatives for the 
condition (v) but we shall not consider them here. 

11. Betweenness in metric, semi metric, and metric ptolemaic spaces. In 
a metric space with distance function $\delta$ one says [3, p. 38] that q is between 
the points p and r in case $\delta(p, q) + \delta(q, r) = \delta(p, r)$ and $p \ne q \ne r$. It is evident 
that this relation fails to satisfy our condition $\beta$. We suggest that it should be 
modified so as to satisfy $\beta$ by deleting the condition $p \ne q \ne r$ which requires 
that the points p, q, r be pairwise distinct. We shall do this and shall write 
pqr for the modified relation, reserving the locution "q is between p and 
r" for the usual relation. K. Menger [9] established the transitivities $t_1$ 
and $t_2$ for metric betweenness. His famous example of a "railroad" space [9, p. 
80] was constructed to prove that the transitivity $t_3$ may fail in metric spaces. 
For the case of a semi metric space [3, p. 38] O. Taussky found that the weak 
transitivity $\tau_1$ holds for the analogue of metric betweenness. Examples of 
semi metric spaces are easily given in which $t_1$ fails. 

There has recently been some interest in spaces which are metric and 
ptolemaic [12, 2], that is, metric spaces in which the three products of the 
lengths of opposite sides of every quadrilateral are the sides of some triangle 
in the euclidean plane. For such spaces L. M. Blumenthal [2] established the 
transitivity $t_3$. Thus in metric ptolemaic spaces we have immediately the 
properties $T_1$-$T_4$, $T_6$, $T_8$, and $T_9$. It is interesting that $T_5$ also holds in such spaces. 
We may see this as follows(8). Let a, b, c, d, x be five points of a metric 
ptolemaic space which satisfy the relations abc, adc, and bxd. Using the 
ptolemaic inequality(9) we obtain $ax \cdot bd \le ab \cdot xd + ad \cdot xb$ and $cx \cdot bd \le bc \cdot xd + dc \cdot xb$. 
Adding these inequalities we find 

(11.1) $bd(ax + cx) \le xd(ab + bc) + xb(ad + dc) = xd \cdot ac + xb \cdot ac = bd \cdot ac$. 

If b=d, then by (1) of §1, b=d=x, and the relation axc is implied by the 
relation abc. If $b \ne d$, then ax+cx=ac from (11.1) and the triangle inequality, 
and the relation axc is true. As an example of the use of the relation abc 
instead of "b is between a and c," let us give the proof of $T_5$ for the second 
relation. It will suffice to prove that $a \ne x \ne c$. Suppose that a=x. By hypothesis, 
a is then between b and d. The transitivity $t_3$ then gives ($a \ne b$!) that a is 
between d and c, which contradicts d between a and c. If x=c, then by 
hypothesis c is between b and d. The transitivity $t_3$ then gives ($b \ne c$!) that c is between 
a and d, which contradicts d between a and c. 

(8) Professor Blumenthal has also noted this fact in a letter to one of us. We are indebted 
to him for a stimulating correspondence during the preparation of this paper. 
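
The euclidean plane is a metric ptolemaic space, so the conclusion just obtained can be illustrated there by enumeration. The sketch below is an illustration and not the argument of the text. It works on a 3 by 3 grid of integer points and uses the exact test "q lies on the closed segment from p to r" in place of the distance equation, to which it is equivalent in the plane; coincident points are allowed, as in the modified relation. The function names are hypothetical.

```python
from itertools import product

def btw(p, q, r):
    """pqr in the plane: q lies on the closed segment joining p and r."""
    cross = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    dot = (p[0] - q[0]) * (r[0] - q[0]) + (p[1] - q[1]) * (r[1] - q[1])
    return cross == 0 and dot <= 0      # collinear, and q not outside the segment

def satisfies_T5(points):
    """abc, adc and bxd imply axc, for every choice of the five points."""
    for a, c in product(points, repeat=2):
        mids = [q for q in points if btw(a, q, c)]          # candidates for b and d
        for b, d in product(mids, repeat=2):
            if any(btw(b, x, d) and not btw(a, x, c) for x in points):
                return False
    return True

grid = list(product(range(3), repeat=2))
print(satisfies_T5(grid))
```

Integer coordinates keep the arithmetic exact, so no tolerance is needed in the comparison.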

None of the remaining five points transitivities, namely, Ts, Tz, and Tio, 
holds in every metric ptolemaic space. This may be shown by examples of 
spaces consisting of five points. 


REFERENCES 


1. Garrett Birkhoff, Lattice Theory, American Mathematical Society Colloquium 
Publications, vol. 25, 1940. 

2. L. M. Blumenthal, Betweenness in metric ptolemaic spaces, Bulletin of the American 
Mathematical Society, abstract 47-1-66. 

3. L. M. Blumenthal, Distance Geometries, The University of Missouri Studies, vol. 13, no. 2 (1938). 

4. W. D. Duthie, Segments of ordered sets, these Transactions, vol. 51 (1942), pp. 1-14. 

5. V. Glivenko, Contributions à l'étude des systèmes de choses normées, American Journal of 
Mathematics, vol. 59 (1937), pp. 941-956. 

6. V. Glivenko, Géométrie des systèmes de choses normées, American Journal of Mathematics, 
vol. 58 (1936), pp. 799-828. 

7. E. V. Huntington, A new set of postulates for betweenness with proof of complete independ- 
ence, these Transactions, vol. 26 (1924), pp. 257-282. 

8. E. V. Huntington and J. R. Kline, Sets of independent postulates for betweenness, these 
Transactions, vol. 18 (1917), pp. 301-325. 

9. K. Menger, Untersuchungen über allgemeine Metrik, Mathematische Annalen, vol. 100 
(1928), pp. 75-163. 

10. Pasch-Dehn, Grundlagen der Geometrie, Berlin, 1926. 

11. Gilbert de B. Robinson, The Foundations of Geometry, Mathematical Expositions, 
Toronto, no. 1, 1940. 

12. I. J. Schoenberg, Metric arcs of vanishing Menger curvature, Annals of Mathematics, (2), 
vol. 41 (1940), pp. 715-726. 

13. A. Wald, Axiomatik des Zwischenbegriffes in metrischen Räumen, Mathematische 
Annalen, vol. 104 (1930-1931), pp. 476-484. 

14. W. E. van de Walle, On the complete independence of the postulates for betweenness, these 
Transactions, vol. 24 (1926), pp. 249-256. 


LEHIGH UNIVERSITY, 
BETHLEHEM, Pa. 


(*) For convenience we now write the distance between two points a and b of the metric 
space simply as ab. 




ON THE BASIS THEOREM FOR DIFFERENTIAL SYSTEMS 


BY 
E. R. KOLCHIN 


One of the principal points of departure in the study of polynomials and 
polynomial ideals is the Hilbert basis theorem, which states that every set m 
of polynomials in a finite number of indeterminates contains a finite subset 
$f_1, \ldots, f_r$ such that 


$m \subseteq (f_1, \ldots, f_r)$. 


As originally proved by Hilbert, this theorem applied to polynomials whose 
coefficients were either elements of a field, or rational integers. In keeping 
with the modern tendency toward abstraction, however, one now finds the 
theorem proved for polynomials whose coefficients are elements of a com- 
mutative ring with unit element in which every set has a finite basis. 

When one turns to differential polynomials and differential ideals one finds 
that the exact analogue of the Hilbert theorem is lacking(1). It is not true 
that every system of differential polynomials $\Sigma$ contains a finite subset 
$F_1, \ldots, F_s$ such that 


$\Sigma \subseteq [F_1, \ldots, F_s]$(2). 


Instead one is forced to choose as a starting point a weakened analogue, the 
basis theorem of Ritt and Raudenbush. This theorem has been proved for 
differential polynomials in a finite number of unknowns (indeterminates) 
$y_1, \ldots, y_n$ with any differential field of characteristic zero as coefficient 
domain(3), and may be stated in either of the two following equivalent forms: 

1. Every system $\Sigma$ of differential polynomials has a finite subset $F_1, \ldots, F_s$ 
such that, for each differential polynomial $A \in \Sigma$ there is a positive integer $t$ 
such that $A^t \in [F_1, \ldots, F_s]$. 

2. Every system $\Sigma$ of differential polynomials has a finite subset $F_1, \ldots, F_s$ 
such that $\Sigma$ is contained in the perfect differential ideal generated by 
$F_1, \ldots, F_s$: 


$\Sigma \subseteq \{F_1, \ldots, F_s\}$(4). 


Presented to the Society, February 22, 1941; received by the editors July 3, 1941. 

(1) See J. F. Ritt, Differential Equations from the Algebraic Standpoint, American Mathe- 
matical Society Colloquium Publications, vol. 14, New York, 1932, pp. 12-13. 

(2) Square brackets [ ] are used for differential ideals. Parentheses ( ) denote, as usual, 
(algebraic) ideals. 

(3) See H. W. Raudenbush, these Transactions, vol. 36 (1934), pp. 361-368. 

(4) The perfect differential ideal generated by a set is denoted by the set enclosed in 
braces { }. 




That these two statements are equivalent (when the coefficient domain is a 
differential field of characteristic zero) follows from the fact that the set of 
all differential polynomials some powers of which are in $[F_1, \ldots, F_s]$ is a 
perfect differential ideal(5). 

It is the object of the present paper to generalize the basis theorem of 
Ritt and Raudenbush, as the Hilbert basis theorem has been generalized, to 
permit more general coefficient domains. There is nothing in the literature, 
for example, which allows treatment of differential polynomials with the set 
of rational integers, or a differential field of nonvanishing characteristic, as 
the domain of coefficients. 

An easy counterexample shows at the outset that there is no hope of generalizing 
the first statement of the theorem. In $\mathfrak{J}\{y\}$, the set of all ordinary 
differential polynomials in $y$ with rational integral coefficients, the system 


$y^p,\ y_1^p,\ y_2^p,\ \ldots,$ 


where $p$ is any integer greater than 1, is such a counterexample(6). For, no 
matter what $n$ is, no power of $y_{n+1}^p$ is contained in $[y^p, y_1^p, \ldots, y_n^p]$. This is 
easy to see since $y_{n+1}$ appears in $[y^p, y_1^p, \ldots, y_n^p]$ only in terms divisible by $p$ 
or by some $y_j$ ($j \leq n$). 
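For concreteness, the case $p = 2$, $n = 0$ can be checked by hand. The computation below is an illustrative sketch rather than part of the original argument; it uses only the Leibniz rule and the convention that $y_j$ denotes the $j$th derivative of $y$.

```latex
% First three derivatives of y^2 in the differential polynomial ring over the integers
\begin{align*}
(y^2)_1 &= 2\,y\,y_1, \\
(y^2)_2 &= 2\,y_1^2 + 2\,y\,y_2, \\
(y^2)_3 &= 6\,y_1 y_2 + 2\,y\,y_3 .
\end{align*}
% Each term is divisible by 2 or by y. Hence every element of [y^2]
% vanishes modulo the ideal (2, y), whereas no power y_1^t does.
```

Reducing modulo $(2, y)$ in this way is one convenient means of seeing that no power of $y_1$ lies in $[y^2]$.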

On the other hand the second statement of the theorem above is susceptible 
of generalization, although not so wide a one as might be expected at 
first blush. A finite subset $b_1, \ldots, b_s$ of a subset $\sigma$ of a differential ring $R$ 
is called a basis of $\sigma$ if 


$\sigma \subseteq \{b_1, \ldots, b_s\}$. 


If every subset of R has a basis we say that the basis theorem holds in R. 
Our main theorem asserts that: 

If R is a commutative differential ring with unit element, in which the basis 
theorem holds, and if R also satisfies a certain condition termed “regularity,” then 
the basis theorem holds in any commutative differential ring R' obtained from R 
by a finite number of differential ring adjunctions. An example shows that the 
regularity condition is not superfluous. 

The admittance of more general coefficient domains complicates the structure 
of perfect differential ideals and makes it desirable to represent, after 
Raudenbush, the perfect differential ideal $\{\sigma\}$ generated by a set $\sigma$ as the 
set-theoretic limit of a non-decreasing sequence of sets denoted by $\{\sigma\}_n$. 
(See §1.) This permits the classification of some bases as 0-bases, 1-bases, 
2-bases, and so on. 

This naturally raises the question whether a set which has a basis has an 


(5) Raudenbush, loc. cit., p. 363. Raudenbush neglects to state that the differential rings 
he considers must contain the rational number system. 
(6) $y_j$ denotes the $j$th derivative of $y$. 




m-basis for some m. This question is only partially answered below and still 
remains for investigation. If every set in a differential ring has an m-basis 
for some m dependent on the set then we say that the *-basis theorem holds 
in that ring. What we show is that if the *-basis theorem holds in R then the 
*-basis theorem holds in R’ (R, R’ as above). Thus we see that every set of 
differential polynomials in $\mathfrak{J}\{y_1, \ldots, y_n\}$ has an m-basis for some m. 
However, it is still unknown whether we may put a bound on m. An example 
shows that any such bound would depend on n. 

For the sake of generality the proofs are given for partial differential rings. 
There is a proof for ordinary differential rings which is materially shorter 
and simpler, and which is not a specialization of the partial case. For its own 
interest we present in §11 an outline of this proof. 

1. Perfect differential ideals. Throughout this paper $R$ will denote a commutative 
(partial) differential ring with $r$ types of differentiation (or derivative 
operators) $\delta_1, \ldots, \delta_r$. 

A differential ideal $\sigma$ in $R$ is called perfect if $\sigma$ contains an element of $R$ 
whenever it contains some power of that element: $a^t \in \sigma$ implies $a \in \sigma$. 

Let $\sigma$ be an arbitrary subset of $R$. There exists a perfect differential ideal 
in $R$ containing $\sigma$; for example, $R$ itself. The intersection of all perfect differential 
ideals containing $\sigma$ is itself a perfect differential ideal containing $\sigma$, and 
is called the perfect differential ideal generated by $\sigma$; in symbols, $\{\sigma\}$. 

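To exhibit the structure of $\{\sigma\}$ we define by induction: 

$\{\sigma\}_0 = (\sigma)$, 
$\{\sigma\}_n$ = set of all $a \in R$ such that $a^t \in [\{\sigma\}_{n-1}]$ for some $t$, $n = 1, 2, \ldots$. 

Each $\{\sigma\}_n$ is an ideal. When $n > 0$, $\{\sigma\}_n$ contains every element of $R$ some 
power of which it contains. Moreover, 


$\{\{\sigma\}_m\}_n = \{\sigma\}_{m+n}$. 


The definitions imply that 
$\{\sigma\} = \{\sigma\}_0 + \{\sigma\}_1 + \cdots$(7). 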
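The relation $\{\{\sigma\}_m\}_n = \{\sigma\}_{m+n}$ is stated without proof. A short induction on $n$, sketched here as a reader's check that uses only the inductive definition, runs as follows.

```latex
% Induction on n, with m fixed
\begin{align*}
\{\{\sigma\}_m\}_0 &= (\{\sigma\}_m) = \{\sigma\}_m
  && \text{since } \{\sigma\}_m \text{ is already an ideal,} \\
\{\{\sigma\}_m\}_n &= \{\, a \in R : a^t \in [\{\{\sigma\}_m\}_{n-1}] \text{ for some } t \,\} \\
  &= \{\, a \in R : a^t \in [\{\sigma\}_{m+n-1}] \text{ for some } t \,\}
  && \text{by the induction hypothesis,} \\
  &= \{\sigma\}_{m+n} .
\end{align*}
```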
2. Bases. A finite subset $b_1, \ldots, b_s$ of $\sigma \subseteq R$ is called a basis of $\sigma$ if 

$\sigma \subseteq \{b_1, \ldots, b_s\}$. 


The basis will be called an m-basis if 


$\sigma \subseteq \{b_1, \ldots, b_s\}_m$(8). 


(7) If $R$ is a differential ring obtained by the differential ring adjunction of a finite number 
of unknowns to a differential field of characteristic 0 then $\{\sigma\} = \{\sigma\}_1$, as is well known. For 
general $R$ this is no longer true. For example, if $R$ is the totality of differential polynomials in $y$ 
with rational integral coefficients, we see, because $y \in \{y^2\}$, that $y_1$, the derivative of $y$, is in 
$\{y^2\}$. Yet $y_1 \notin \{y^2\}_1$ because $y_1$ appears in $[y^2]$ only in terms which are divisible by $y$ 
or by 2. 


One says that the basis theorem holds in R if every subset of R has a basis. 
If every subset of R has an m-basis, with m depending on the subset, then we 
shall say that the *-basis theorem holds in R. If every subset has an m-basis, 
with a single m independent of the subset, we shall say that the m-basis 
theorem holds in R. 

The basis theorem of Ritt and Raudenbush mentioned above is seen to be, 
in our terminology, a 1-basis theorem. 

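3. A useful result. Let $a$ be an arbitrary element of $R$, $\sigma$ an arbitrary subset 
of $R$. Denote the set of all elements $af$ ($f \in \sigma$) by $a \cdot \sigma$. 

We shall show that 


$a \cdot \{\sigma\}_m \subseteq \{a \cdot \sigma\}_m$. 


Indeed, since $\{\sigma\}_0 = (\sigma)$, the relation in question subsists when $m = 0$. Suppose 
it holds for $m = k$. Let $f \in \{\sigma\}_k$. We show that 


$a^{i_1 + \cdots + i_r + 1}\, \delta_1^{i_1} \cdots \delta_r^{i_r} f \in [\{a \cdot \sigma\}_k]$. 


Indeed, since this relation is obvious for $i_1 + \cdots + i_r = 0$ it follows in general 
from the fact that 


$a^{s+1}\, \delta_j g = a\, \delta_j(a^s g) - s\,(\delta_j a)(a^s g)$. 


Thus, $a \cdot [\{\sigma\}_k] \subseteq \{a \cdot \sigma\}_{k+1}$. Hence, if $g \in \{\sigma\}_{k+1}$, that is, if $g^t \in [\{\sigma\}_k]$, then 
$a g^t \in \{a \cdot \sigma\}_{k+1}$, $ag \in \{a \cdot \sigma\}_{k+1}$, so that $a \cdot \{\sigma\}_{k+1} \subseteq \{a \cdot \sigma\}_{k+1}$, q.e.d. 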
An easy consequence of our result is that 
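The displayed formula that completes this sentence is not legible here. One reading that is consistent with the way §3 is invoked in §§5 and 8 is the product rule sketched below; it is offered as an assumption, not as the author's exact statement. The symbol $\tau$ for a second subset of $R$ and the notation $\sigma\tau$ for the set of all products $fg$ ($f \in \sigma$, $g \in \tau$) are introduced only for this sketch.

```latex
% Assumed form of the consequence:
%   {sigma}_m {tau}_n  \subseteq  {sigma tau}_{m+n}
% Let a \in {sigma}_m and b \in tau. Applying the result of section 3 twice:
\begin{align*}
ab \in b \cdot \{\sigma\}_m &\subseteq \{b \cdot \sigma\}_m \subseteq \{\sigma\tau\}_m ,
  \qquad \text{so } a \cdot \tau \subseteq \{\sigma\tau\}_m , \\
a \cdot \{\tau\}_n &\subseteq \{a \cdot \tau\}_n
  \subseteq \{\{\sigma\tau\}_m\}_n = \{\sigma\tau\}_{m+n} .
\end{align*}
% The inclusions {X}_n \subseteq {Y}_n for X \subseteq Y follow from the definition in section 1.
```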


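4. Maximal subsets(9). Let $\mathfrak{M}$ be a collection of subsets of $R$ such that 
every transfinite sequence $\sigma_\xi$ of subsets of $R$ in $\mathfrak{M}$ which satisfies the condition 


$\sigma_\xi \subseteq \sigma_\eta$ if $\xi < \eta$ 


also satisfies the condition 


$\sum \sigma_\xi \in \mathfrak{M}$. 


We shall prove that $\mathfrak{M}$ contains a maximal subset of $R$, that is, a $\phi \in \mathfrak{M}$ 
such that no $\psi \in \mathfrak{M}$ properly contains $\phi$. 

Indeed, let $\sigma_\xi$ be a well-ordering of $\mathfrak{M}$. Define by transfinite induction: 

$\psi_\eta$ = the first $\sigma_\xi$ such that $\sigma_\xi$ properly contains $\psi_\zeta$ for all $\zeta < \eta$. 

By the construction, no $\sigma_\xi$ properly contains every $\psi_\eta$. The resulting transfinite 
sequence $\psi_\eta$ must have a last element. For otherwise $\sum \psi_\eta$ would be a $\sigma_\xi$ 
properly containing every $\psi_\eta$. This last element is a maximal subset. 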


(8) Thus, if $m \leq n$, every m-basis is an n-basis. 
(9) In this section R may be an abstract set. 




5. Systems of differential polynomials of bounded order. We suppose 
henceforth that R contains a unit element 1. 

Let $y_1, \ldots, y_n$ be unknowns, and let $\Phi$ be a set of (partial) differential 
polynomials, or forms, in $R\{y_1, \ldots, y_n\}$(10), of bounded orders. We shall show 
that if the basis (or *-basis) theorem holds in $R$ then $\Phi$ has a basis (or m-basis, 
for some finite m). 

Proof. Because the differential polynomials in $\Phi$ are of bounded orders, 
only a finite number of partial derivatives of the $y_i$ are effectively present in 
the forms of $\Phi$. Let $q$ be the least integer such that there exists a set $\Phi$, 
involving only $q$ derivatives of the $y_i$, which has no basis (or m-basis). By the 
hypothesis on $R$, $q > 0$. We work toward a contradiction. 

If $\Phi_\xi$ is a transfinite sequence of sets of differential polynomials in 
$R\{y_1, \ldots, y_n\}$ involving only $q$ derivatives of the $y_i$ such that 


$\Phi_\xi \subseteq \Phi_\eta$, if $\xi < \eta$, 


no $\Phi_\xi$ having a basis (or m-basis), then $\sum \Phi_\xi$ involves only $q$ derivatives of 
the $y_i$ and has no basis (or m-basis). For if $\sum \Phi_\xi$ had a basis (or m-basis) there 
would be a single $\Phi_\xi$ which would contain every differential polynomial of the 
basis, and that $\Phi_\xi$ itself would have a basis (or m-basis). Therefore, by §4, 
there is a maximal set of forms involving only $q$ derivatives of the $y_i$ which 
has no basis (or m-basis). 

Let $\Phi$ be such a maximal set. Denote the $q$ partial derivatives of the $y_i$ 
present in $\Phi$ by $\alpha_1, \ldots, \alpha_q$. 

It is clear that $\Phi$ is an ideal in $R[\alpha_1, \ldots, \alpha_q]$, for otherwise the ideal generated 
by $\Phi$ in $R[\alpha_1, \ldots, \alpha_q]$ would properly contain $\Phi$, would involve only 
$q$ derivatives, and would have no basis (or m-basis). 

Let $\Phi'$ be the set of differential polynomials in $\Phi$ which are free of $\alpha_q$. 

If every element of $\Phi$, written as a polynomial in $\alpha_q$, had each coefficient 
in $\Phi'$, we would have $\Phi \subseteq (\Phi')$, so that $\Phi$ would have a basis (or m-basis), 
because $\Phi'$ does. Hence $\Phi$ contains a form in which $\alpha_q$ is effectively present 
and which, when written as a polynomial in $\alpha_q$, has its leading coefficient 
not in $\Phi$. 

Of all such differential polynomials let 


$B = I\alpha_q^s + \cdots$ 


be one of minimum degree $s$ in $\alpha_q$. Then, for each $G \in \Phi$, we have, for suitable $t$, 


$I^t G \equiv G' \quad (B)$, 

where $G' \in \Phi$ has its degree in $\alpha_q$ less than $s$(11). By the minimal nature of 
the degree of $B$ it follows that $G' \in (\Phi')$. But $\Phi'$ has a basis (or $m_1$-basis) 
$D_1, \ldots, D_u$. Hence $IG \in \{B, D_1, \ldots, D_u\}$ (or $IG \in \{B, D_1, \ldots, D_u\}_{m_1}$). 
Now, by the maximality of $\Phi$, $(I, \Phi)$ has a basis (or $m_2$-basis) which we 
may write as $I, D_{u+1}, \ldots, D_v$, where each $D_i \in \Phi$. Hence, referring to §3, 

$\{IG, D_{u+1}, \ldots, D_v\} \subseteq \{\{B, D_1, \ldots, D_u\}, D_{u+1}, \ldots, D_v\}$, 
$G \in \{B, D_1, \ldots, D_v\}$, 
$\Phi \subseteq \{B, D_1, \ldots, D_v\}$ 

(or, similarly, $\Phi \subseteq \{B, D_1, \ldots, D_v\}_{m_1+m_2}$). This contradiction completes the 
proof. 

(10) $R\{y_1, \ldots, y_n\}$ means the ring obtained by the differential ring adjunction of 
$y_1, \ldots, y_n$ to $R$. 

(11) Here we use for the first time the fact that R contains a unit element. 
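The passage from the first displayed inclusion to $G \in \{B, D_1, \ldots, D_v\}$ is compressed. One way to fill in the step, sketched here for the plain basis case and assuming §3 is used in the form $a \cdot \{\sigma\} \subseteq \{a \cdot \sigma\}$, is the following.

```latex
% G lies in Phi, hence in (I, Phi), and (I, Phi) \subseteq {I, D_{u+1}, ..., D_v}.
\begin{align*}
G^2 \in G \cdot \{I, D_{u+1}, \ldots, D_v\}
  &\subseteq \{IG,\ G D_{u+1}, \ldots, G D_v\} \\
  &\subseteq \{\{B, D_1, \ldots, D_u\},\ D_{u+1}, \ldots, D_v\}
   = \{B, D_1, \ldots, D_v\} .
\end{align*}
% A perfect ideal contains G as soon as it contains G^2,
% so G \in {B, D_1, ..., D_v} for every G in Phi.
```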


6. Regular differential rings. A differential ring $R$ will be called regular 
if every prime differential ideal $\pi \subseteq R$ which contains a prime rational integer 
$p$ is such that the congruence 


$a \equiv x^p \quad (\pi)$ 


has a solution $x \in R$ for every $a \in R$ (that is, if every element has a $p$th root 
modulo $\pi$). 

If R is of characteristic p >0 then every ideal contains p and no ideal other 
than R itself contains a prime number different from p. 

Examples of regular differential rings are: 

1. every differential ring which contains the rational number system; 

2. every differential ring with unit element of characteristic g>0 in 
which each element has a gth root; 

3. every perfect (“vollstandig”) differential field; 

4. the differential ring of rational integers. 

7. The basis theorem. The theorem we shall prove is the following: 


Let $R$ be a regular commutative differential ring with unit element. Let $R'$ be 
a commutative differential ring obtained from $R$ by the differential ring adjunction 
of a finite number of elements: $R' = R\{\eta_1, \ldots, \eta_n\}$(12). If the basis (or 
*-basis) theorem holds in $R$ then the basis (or *-basis) theorem holds in $R'$. 


It is necessary to prove the theorem only for the case in which the $\eta_i$ are 
all unknowns, $\eta_i = y_i$; for if the basis theorem holds in $R\{y_1, \ldots, y_n\}$ then 
it is easy to see that it will continue to hold when any or all of the $y_i$ are 
replaced by elements among which an algebraic differential relation subsists. 

8. The proof begun. Assume that there exists in $R' = R\{y_1, \ldots, y_n\}$ a 
system which does not have a basis (or m-basis for any m). 

If $\Sigma_\xi$ is a transfinite sequence of such systems with $\Sigma_\xi \subseteq \Sigma_\eta$ whenever 
$\xi < \eta$ then the logical sum of the $\Sigma_\xi$ is again such a system. For if the logical 
sum had a basis (or m-basis) then there would be a single $\Sigma_\xi$ which would 
contain every form of the basis, and that $\Sigma_\xi$ itself would have a basis (or 
m-basis). 

By §4 it follows that there is a maximal system which has no basis (or 
m-basis). We let $\Sigma$ be such a maximal system and seek a contradiction. 

(12) The $\eta_i$ may be hypertranscendental over $R$ (for example, they may be unknowns) or 
may satisfy some algebraic differential relation with coefficients in $R$. 

$\Sigma$ is a differential ideal, for $[\Sigma]$, like $\Sigma$, has no basis (or m-basis) and 
therefore can not properly contain $\Sigma$. Moreover, $\Sigma$ is prime. To prove this, 
assume to the contrary that $AB \in \Sigma$, $A \notin \Sigma$, $B \notin \Sigma$. Then $(\Sigma, A)$ and $(\Sigma, B)$ 
properly contain $\Sigma$ and must have bases (or $m_1$- and $m_2$-bases, respectively), 
say $A, C_1, \ldots, C_u$ and $B, C_{u+1}, \ldots, C_v$, respectively, where the $C_i$ are in 
$\Sigma$. Thus 


$(\Sigma, A)(\Sigma, B) \subseteq \{A, C_1, \ldots, C_u\}\{B, C_{u+1}, \ldots, C_v\} \subseteq \{AB, C_1, \ldots, C_v\}$, 


so that $\Sigma \subseteq \{AB, C_1, \ldots, C_v\}$, and $\Sigma$ has a basis (or, similarly, an $(m_1+m_2)$-basis). 
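The last inference, from the inclusion of the product to $\Sigma \subseteq \{AB, C_1, \ldots, C_v\}$, can be unpacked as below. This is a reader's sketch for the plain basis case; it assumes the product form of the §3 result, namely that a product of two perfect ideals lies in the perfect ideal generated by the pairwise products of their generators.

```latex
% Every G in Sigma lies in both (Sigma, A) and (Sigma, B), so
\begin{align*}
G^2 \in (\Sigma, A)(\Sigma, B)
  &\subseteq \{A, C_1, \ldots, C_u\}\,\{B, C_{u+1}, \ldots, C_v\} \\
  &\subseteq \{AB,\ A C_j,\ C_i B,\ C_i C_j\}
   \subseteq \{AB, C_1, \ldots, C_v\} .
\end{align*}
% The right-hand side is perfect, so G itself lies in {AB, C_1, ..., C_v}.
% Since AB and the C_i belong to Sigma, they form a basis of Sigma.
```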

9. The proof continued. The object of this section is to show that $\Sigma$ contains 
a prime rational integer $p$(13). To accomplish this we introduce a set of 
differential polynomials analogous to the “basic sets” used by Ritt. 

We assume that the partial derivatives of the $y_i$ are completely ordered 
by a system of marks in such a way that every partial derivative of the $y_i$ is 
lower than (precedes) every other derivative of the $y_i$ of higher order, and 
if $\alpha$ and $\beta$ are two derivatives of the $y_i$ with $\alpha$ lower than $\beta$ then $\delta_i\alpha$ is lower 
than $\delta_i\beta$, $i = 1, \ldots, r$. Such an ordering can always be effected(14). 

Let $\sigma = \Sigma \cap R$. Clearly $\sigma$ is a prime differential ideal in $R$. Since the basis 
(or *-basis) theorem holds in $R$, $\sigma$ has a basis (or m-basis). Hence $\Sigma \not\subseteq (\sigma)$ so 
that $\Sigma$ must contain forms none of whose coefficients is in $\sigma$. 

Of all the forms in $\Sigma$ none of whose coefficients is in $\sigma$, consider those with 
lowest possible leader $\alpha_1$ (the leader of a form is the highest derivative of the $y_i$ 
effectively present in the form). Of all those forms let $A_1$ be one whose degree 
in $\alpha_1$ is as low as possible. 

Of all the forms in $\Sigma$ none of whose coefficients is in $\sigma$, which do not contain 
a proper derivative (that is, a derivative of positive order) of $\alpha_1$, and 
which are of lower degree in $\alpha_1$ than $A_1$, consider those with lowest possible 
leader $\alpha_2$. Of all those let $A_2$ be one whose degree in $\alpha_2$ is a minimum. 

Continuing, at the $j$th step, consider, of the forms in $\Sigma$ none of whose 
coefficients is in $\sigma$, which do not contain a proper derivative of $\alpha_i$ ($i = 1, \ldots, j-1$) 
and which are of lower degree in $\alpha_i$ than $A_i$ ($i = 1, \ldots, j-1$), those forms 
which have the lowest leader $\alpha_j$. Of all those forms let $A_j$ be one whose degree 
in $\alpha_j$ is a minimum. 

Since no $\alpha_i$ is a derivative of any preceding $\alpha_j$, there can be only a finite 
number of the $\alpha_i$(15), so that the process for defining the forms $A_i$ must stop. 
Let $A_s$ be the last $A_i$. 

It is easy to see that if $G$ is a form in $\Sigma$ which contains no proper derivative 
of any $\alpha_i$ and whose degree in each $\alpha_i$ is lower than that of the corresponding 
$A_i$, then $G \in (\sigma)$. 

Let $I_i$ and $S_i$ be the initial and separant of $A_i$. The coefficients of $I_i$ are 
coefficients of $A_i$ and therefore are not in $\sigma$. $I_i$ contains no proper derivative of 
any $\alpha_j$ and is of lower degree in $\alpha_j$ than $A_j$ ($j = 1, \ldots, s$). Hence $I_i \notin \Sigma$. 

(13) If $R$ contained all the rational numbers this would suffice, for then $\Sigma$ would contain 
$1 = (1/p) \cdot p$, and would have 1 as a basis. 
(14) Ritt, loc. cit., pp. 141-143. 

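We shall show that at least one $S_i$ is in $\Sigma$. 

Let no $S_i$ be in $\Sigma$. For an arbitrary form $G \in \Sigma$ there exist integers $g_i$, $h_i$ 
such that 


$I_1^{g_1} S_1^{h_1} \cdots I_s^{g_s} S_s^{h_s}\, G \equiv G' \quad [A_1, \ldots, A_s]$, 


where $G' \in \Sigma$ contains no proper derivative of any $\alpha_i$ and is of lower degree 
in $\alpha_i$ than $A_i$. Thus, by the above, $G' \in (\sigma)$, so that 


$I_1^{g_1} S_1^{h_1} \cdots I_s^{g_s} S_s^{h_s}\, G \equiv 0 \quad [A_1, \ldots, A_s, \sigma]$, 
$I_1 S_1 \cdots I_s S_s\, G \in \{A_1, \ldots, A_s, \sigma\}$. 


Now, $\Sigma$ is prime and contains no $I_i$ or $S_i$, so that $I_1 S_1 \cdots I_s S_s \notin \Sigma$. Hence, 
by the maximality of $\Sigma$, the system 


$\Sigma,\ I_1 S_1 \cdots I_s S_s$ 
has a basis (or $m_1$-basis) which we may write as 
$I_1 S_1 \cdots I_s S_s,\ B_1, \ldots, B_u$. 


Denoting by $B_{u+1}, \ldots, B_v$ a basis (or $m_2$-basis) of $\sigma$, we have 


$\Sigma \cdot \{I_1 S_1 \cdots I_s S_s, B_1, \ldots, B_u\} \subseteq \{I_1 S_1 \cdots I_s S_s \cdot \Sigma, B_1, \ldots, B_u\}$ 
$\subseteq \{A_1, \ldots, A_s, \sigma, B_1, \ldots, B_u\}$ 
$\subseteq \{A_1, \ldots, A_s, B_1, \ldots, B_v\}$, 

$\Sigma \subseteq \{A_1, \ldots, A_s, B_1, \ldots, B_v\}$ 


(or, similarly, $\Sigma \subseteq \{A_1, \ldots, A_s, B_1, \ldots, B_v\}_m$ for a suitable $m$). This contradicts the 
fact that $\Sigma$ has no basis (or m-basis) and proves that $S_i \in \Sigma$ for at least one $i$. 

Let $S_j$ be the first $S_i$ contained in $\Sigma$. $S_j$ contains no proper derivative of 
any $\alpha_i$ and is of lower degree in $\alpha_i$ than $A_i$ ($i = 1, \ldots, s$). Hence $S_j \in (\sigma)$. It 
follows that $n_j I_j \in \Sigma$, where $n_j$ is the degree of $A_j$ in $\alpha_j$, so that $n_j \in \Sigma$, and 
one of the prime factors $p$ of $n_j$ must be in $\Sigma$. This completes the proof of the 
result at the beginning of this section. 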
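The step from $S_j \in (\sigma)$ to a prime integer in $\Sigma$ may be spelled out as follows. This is a reader's sketch; it reads $(\sigma)$ as the set of forms all of whose coefficients lie in $\sigma$ and uses that $\Sigma$ is prime with $I_j \notin \Sigma$.

```latex
% Write A_j by powers of its leader alpha_j; n_j is its degree, I_j its initial.
\begin{align*}
A_j &= I_j\,\alpha_j^{\,n_j} + (\text{terms of lower degree in } \alpha_j), \\
S_j &= \frac{\partial A_j}{\partial \alpha_j}
     = n_j I_j\,\alpha_j^{\,n_j - 1} + (\text{terms of lower degree in } \alpha_j).
\end{align*}
% S_j \in (sigma) forces the coefficient n_j I_j into (sigma) \subseteq Sigma.
% Sigma is prime and I_j \notin Sigma, hence n_j \in Sigma.
% Factoring n_j = p_1 p_2 \cdots p_k into primes, primality of Sigma
% puts at least one p_i in Sigma.
```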


(15) Ritt, loc. cit., pp. 135-136. 




10. The proof concluded. Let $F$ be any nonzero differential polynomial, 
$\gamma$ any partial derivative of the $y_i$. $F$ can be written in one and only one way 
as a polynomial 


$H_0 + H_1\gamma + \cdots + H_h\gamma^h, \quad H_h \neq 0,$ 


in $\gamma$ of degree $h < p$, where $\gamma$ does not appear in the $H_i$ except raised to powers 
divisible by $p$. We shall call $h$ the p-degree of $F$ in $\gamma$. The highest derivative 
of the $y_i$ in which $F$ has a positive p-degree (if such a derivative exists) shall 
be called the p-leader of $F$. If $\gamma$ is the p-leader of $F$ and if $h$ is the p-degree 
of $F$ in $\gamma$, we shall call the coefficient of $\gamma^h$ the p-initial of $F$, and $\partial F/\partial\gamma$ the 
p-separant of $F$. 

We shall need the fact that $\Sigma$ contains a form, none of whose coefficients 
is in $\sigma$, whose p-degree in some derivative of the $y_i$ is positive and whose 
p-initial is not in $\Sigma$. To prove this assume the contrary and let $G$ be a form 
of $\Sigma$, none of whose coefficients is in $\sigma$, of least possible (total) degree. Every 
term of $G$ involves only powers divisible by $p$, else the p-degree of $G$ in some 
derivative of the $y_i$ would be positive and the p-initial of $G$ would be a form 
in $\Sigma$, none of whose coefficients is in $\sigma$, of lower degree than $G$. Moreover, by 
the regularity of $R$, the coefficient of each term of $G$ may be replaced modulo $\sigma$ 
by the $p$th power of an element of $R$(16). Since $p \in \sigma$ it follows that $G \equiv H^p$ $(\sigma)$, 
where $H$ is the form obtained from $G$ by replacing each term by its $p$th root 
modulo $\sigma$. $H$ is of lower degree than $G$ and is in $\Sigma$. This contradicts the definition 
of $G$ and proves the required fact. 
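The congruence $G \equiv H^p$ $(\sigma)$ rests on the fact that the binomial coefficients $\binom{p}{k}$, $0 < k < p$, are divisible by $p$. A two-term illustration is sketched below; the letters $a, b, c, d$ (elements of $R$) and $u, v$ (power products of derivatives of the $y_i$) are hypothetical names chosen only for this example.

```latex
% Suppose G = a u^p + b v^p with a \equiv c^p and b \equiv d^p modulo sigma,
% and put H = c u + d v. Since p \in sigma:
\begin{align*}
H^p = (cu + dv)^p
  &= \sum_{k=0}^{p} \binom{p}{k} (cu)^k (dv)^{p-k} \\
  &\equiv c^p u^p + d^p v^p \equiv a\,u^p + b\,v^p = G \quad (\sigma).
\end{align*}
% Hence H^p - G \in (sigma) \subseteq Sigma, so H^p \in Sigma,
% and H \in Sigma because Sigma is prime.
```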

Of all the forms of $\Sigma$, none of whose coefficients is in $\sigma$, which involve 
derivatives of the $y_i$ to a power not divisible by $p$ and whose p-initials are not 
in $\Sigma$, consider those with lowest p-leader $\beta_1$. Of all those forms let $B_1$ be one 
whose p-degree in $\beta_1$ is as low as possible. 

Of all the forms in $\Sigma$, none of whose coefficients is in $\sigma$, which involve 
derivatives of the $y_i$ to a power not a multiple of $p$, whose p-initials are not 
in $\Sigma$, which do not contain a proper derivative of $\beta_1$ except raised to a power 
divisible by $p$, and which have a p-degree in $\beta_1$ less than that of $B_1$, consider 
those with lowest possible p-leader $\beta_2$. Of all those let $B_2$ be one whose p-degree 
in $\beta_2$ is a minimum. 

Continuing, at the $j$th step, of all the forms in $\Sigma$, none of whose coefficients 
is in $\sigma$, which contain derivatives of the $y_i$ to powers not divisible by $p$, whose 
p-initials are not in $\Sigma$, which do not contain a proper derivative of $\beta_i$ except 
to a power divisible by $p$ ($i = 1, \ldots, j-1$) and which have a p-degree in $\beta_i$ 
less than that of $B_i$ ($i = 1, \ldots, j-1$), consider those with lowest p-leader $\beta_j$. 
Of all those forms let $B_j$ be one whose p-degree in $\beta_j$ is as low as possible. 

As with the $A_i$ of §9, the process of defining the $B_i$ must stop after a finite 
number of steps. Let $B_s$ be the last $B_i$(17). Let $J_i$ and $T_i$ be the p-initial and 
p-separant, respectively, of $B_i$. 

(16) Up to this point we have not used the regularity. Henceforth it will be important. 

(17) The $s$ here is not necessarily the same as that of §9. 




If $G$ is a form of $\Sigma$, none of whose coefficients is in $\sigma$, which contains no 
proper derivative of any $\beta_i$ except raised to powers divisible by $p$ and whose 
p-degree in each $\beta_i$ is less than that of the corresponding $B_i$, then either $G$ 
contains no derivative of the $y_i$ that is raised to a power not divisible by $p$, 
or the p-initial of $G$ is in $\Sigma$. 

From this it can be shown that the $T_i$ are not contained in $\Sigma$. We already 
know that the $J_i$ are not in $\Sigma$. 

Let $\alpha$ represent the highest derivative of the $y_i$ effectively present in 
$B_1, \ldots, B_s$(18). Let $\Sigma_\alpha$ denote the totality of forms in $\Sigma$ which contain no 
derivative of the $y_i$ which is higher than $\alpha$. The forms of $\Sigma_\alpha$ are of bounded 
order. 

We shall show that for each differential polynomial $G \in \Sigma$ there exist 
nonnegative integers $f_i$ such that 


Assume that this is not so. If $G$ is a form in $\Sigma$ for which such a congruence 
fails to hold it is easy to see that there is a relation 


where $G'$ is a form in $\Sigma$ for which such a congruence fails to hold, which contains 
no proper derivative of any $\beta_i$ except to a power divisible by $p$, and 
which has, in each $\beta_i$, a p-degree lower than that of the corresponding $B_i$. 
Of all forms in $\Sigma$ which fail to satisfy a congruence as above, which contain 
no proper derivative of any $\beta_i$ except to a power divisible by $p$, and which have, 
in each $\beta_i$, a p-degree lower than that of the corresponding $B_i$, consider those 
with the least number of terms. Of all those forms let $G$ be one with a minimum 
(total) degree. Since $\sigma \subseteq \Sigma_\alpha$, no coefficient of $G$ is in $\sigma$. Hence, either $G$ 
contains only powers divisible by $p$ or the p-initial of $G$ is in $\Sigma$. Suppose $G$ 
contains only powers divisible by $p$. By the regularity of $R$, each coefficient 
of $G$ may be replaced modulo $\sigma$ by the $p$th power of an element of $R$. Hence 
$G \equiv H^p$ $(\sigma)$, where $H \in \Sigma$, having the same number of terms as $G$ and being 
of lower degree than $G$, satisfies a congruence as above. But this is impossible 
as then $G$ would satisfy such a congruence. Thus $G$ has a p-leader $\gamma$ and the 
p-initial of $G$ is in $\Sigma$: 

$G = K_0 + \cdots + K_h\gamma^h, \quad K_h \in \Sigma.$ 
Since $K_h$ is of lower degree than $G$, $K_h$ satisfies a congruence of the type in 
question. But $K_0 + \cdots + K_{h-1}\gamma^{h-1}$, which is in $\Sigma$ and has fewer terms than $G$, 
must also satisfy such a congruence. This is impossible, however, for it implies 
that $G$ itself satisfies the same kind of congruence. 

Thus we have shown that 


Now, by the result of §5, $\Sigma_\alpha$ has a basis (or $m_1$-basis), say $D_1, \ldots, D_u$. Also, 
by the maximality of $\Sigma$, the system 


$\Sigma,\ J_1T_1 \cdots J_sT_s$ 
has a basis (or $m_2$-basis) which we may write as 
$J_1T_1 \cdots J_sT_s,\ D_{u+1}, \ldots, D_v$. 
Hence, by §3, 
$\{J_1T_1 \cdots J_sT_s \cdot \Sigma,\ D_{u+1}, \ldots, D_v\} \subseteq \{D_1, \ldots, D_v\}$, 
$\Sigma \subseteq \{D_1, \ldots, D_v\}$ 


(or, similarly, $\Sigma \subseteq \{D_1, \ldots, D_v\}_m$ for a suitable $m$). This contradiction completes the 
proof of the theorem stated in §7. 

(18) $\alpha$ may be higher than $\beta_s$, as the p-degree of each $B_i$ in $\alpha$ may be 0. 

11. Shorter proof in the ordinary case. We sketch in this section a shorter 
proof of the theorem under the assumption that we are dealing with ordinary 
differential rings, that is, differential rings with one type of differentiation. 

Denote the jth derivative of any letter u by u;. 

Of the above proof we take over §§1-6. 

We first show that, when the basis (or *-basis) theorem holds in $R$ and $R$ 
is regular, the basis (or *-basis) theorem holds in $R\{y\}$. Assuming the contrary 
we obtain, as in §8, a maximal system $\Sigma \subseteq R\{y\}$ which has no basis 
(or m-basis). $\Sigma$ is a prime differential ideal. If $F$ were a form in $\Sigma$ whose 
separant $S$ was not in $\Sigma$, we could write, for each $G \in \Sigma$, $S^tG \equiv G'$ $[F]$, with $G'$ 
of order no higher than that of $F$, so that we would have $S \cdot \Sigma \subseteq \{\Sigma'\}$, where $\Sigma'$ 
is the set of forms of $\Sigma$ whose orders are less than or equal to the order of $F$. 
By §5, $\Sigma'$ would have a basis (or $m_1$-basis), $B_1, \ldots, B_u$. Also, 
by the maximality of $\Sigma$, the system $\Sigma, S$ would have a basis (or $m_2$-basis), 
say $S, B_{u+1}, \ldots, B_v$. Thus we would have 


$\{S \cdot \Sigma,\ B_{u+1}, \ldots, B_v\} \subseteq \{B_1, \ldots, B_v\}$, 
$\Sigma \subseteq \{B_1, \ldots, B_v\}$ 


(or, similarly, $\Sigma \subseteq \{B_1, \ldots, B_v\}_m$ for a suitable $m$). This cannot be, so that every form in 
$\Sigma$ must have its separant in $\Sigma$. 

Of all forms in $\Sigma$ none of whose coefficients is in $\Sigma$ let $A$ be one whose 
(total) degree is a minimum. Since $S$, the separant of $A$, is of lower degree 
than $A$, all the coefficients of $S$ must be in $\Sigma$. These coefficients are coefficients 
of $A$ multiplied by the exponents to which $y_q$ appears in $A$. (Here $q$ 
is the order of $A$.) Since $\Sigma$ is prime and the coefficients of $A$ are not in $\Sigma$, 
these exponents must be in $\Sigma$. These exponents have a common prime factor 
$p \in \Sigma$, and we see that $y_q$ appears in $A$ only to powers divisible by $p$. It is now 
easy to see that every derivative of $y$ appears in $A$ only to powers divisible 
by $p$; for suppose $y_j$ is the $y_i$ of highest subscript which appears in $A$ to a 
power not a multiple of $p$. Then the $(q-j+1)$st derivative of $A$ would be, 
terms divisible by $p$ neglected, a form in $\Sigma$ whose separant is not in $\Sigma$, an 
impossibility. 

Now, by the regularity of $R$, we may replace modulo $\Sigma$ each coefficient 
of $A$ by the $p$th power of an element of $R$. Hence $A \equiv B^p$ $(\Sigma)$, where $B \in \Sigma$ 
has no coefficient in $\Sigma$ and is of lower degree than $A$. This completes the proof 
for $R\{y\}$. 

Proceeding by induction, suppose the theorem has been proved for 
$R\{y_1, \ldots, y_{n-1}\}$(19). As above, we find, for a maximal system $\Sigma \subseteq R\{y_1, \ldots, y_n\}$, 
that the separant of each form of $\Sigma$ must itself be in $\Sigma$. This must be true no 
matter how we order the unknowns. Letting $A$ be a form in $\Sigma$, with no coefficients 
in $\Sigma$, of minimum degree, we see from the above that each $y_{ij}$ appears 
in $A$ only to powers divisible by a prime rational integer $p \in \Sigma$. As in the case 
of one unknown this leads to a contradiction and completes the proof. 

12. Examples. From the point of view of analogy with the Hilbert basis 
theorem, it might be imagined that the regularity condition imposed in the 
basis theorem above is unnecessary. The following example shows that this 
is not so. 

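EXAMPLE 1. Let $R$ be the ordinary differential field of characteristic $p > 0$ 
obtained from the field of rational integers modulo $p$ by the differential field 
adjunction of the set of “indeterminate constants” $c_0, c_1, c_2, \ldots$, that is, each 
$c_i$ is a letter whose derivative is taken to be 0, and $R = \mathfrak{J}_p(c_0, c_1, c_2, \ldots)$. Let $y$ 
be an unknown and consider, in $R\{y\}$, the system $\Phi$: 

$y^p + c_0,\ y_1^p + c_1,\ y_2^p + c_2,\ \ldots.$ 

We shall show that $\Phi$ has no basis. 
Indeed, if $\Phi$ had a basis we should have, for some $k$, 

$\Phi \subseteq \{y^p + c_0, \ldots, y_{k-1}^p + c_{k-1}\}$. 

Now, $(y_i^p + c_i)_1 = 0$ so that 

$[y^p + c_0, \ldots, y_{k-1}^p + c_{k-1}] = (y^p + c_0, \ldots, y_{k-1}^p + c_{k-1})$. 

But clearly $AB \in (y^p + c_0, \ldots, y_{k-1}^p + c_{k-1})$ implies that $A$ or $B \in (y^p + c_0, \ldots, 
y_{k-1}^p + c_{k-1})$. Hence $(y^p + c_0, \ldots, y_{k-1}^p + c_{k-1})$ is a prime differential ideal, so 
that 

$\{y^p + c_0, \ldots, y_{k-1}^p + c_{k-1}\} = (y^p + c_0, \ldots, y_{k-1}^p + c_{k-1})$. 

But it is easy to verify that 

$y_k^p + c_k \notin (y^p + c_0, \ldots, y_{k-1}^p + c_{k-1})$. 

(19) The $y_i$ are unknowns. The $j$th derivative of $y_i$ is denoted by $y_{ij}$. 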
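The vanishing of the derivatives of the generators, on which Example 1 turns, is a one-line computation. The sketch below assumes only that $R$ has characteristic $p$ and that each $c_i$ is a constant.

```latex
% Derivative of a generator of the system in characteristic p
\begin{align*}
(y_i^p + c_i)_1 = p\,y_i^{\,p-1}\,y_{i+1} + (c_i)_1 = 0 + 0 = 0 .
\end{align*}
% All higher derivatives vanish as well, so the differential ideal generated by
% y^p + c_0, ..., y_{k-1}^p + c_{k-1} coincides with the algebraic ideal they generate.
```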


Let $\mathfrak{J}$ be the ordinary differential ring of rational integers. Let $n = 2^{m-1}$, 
where $m$ is any positive integer. The following example shows that the m-basis 
theorem does not hold in $\mathfrak{J}\{y_1, \ldots, y_n\}$. 

EXAMPLE 2. Let $\Phi$ be the system in $\mathfrak{J}\{y_1, \ldots, y_n\}$ consisting of the forms 


2 2 2 2 2 2 


$\Phi$ has no m-basis. 
To prove this assume the contrary. Then, for some k, 


Letting $y_1 = y_2 = z_1$, $y_3 = y_4 = z_2$, $\ldots$, where $n' = n/2 = 2^{m-2}$, we 
see, in the differential ring $R\{z_1, \ldots, z_{n'}\}$, that 


2 2 2 2 2 2 


Continuing, at each step we reduce the number of unknowns by one half until 
we arrive, in $R\{u_1\}$, at the relation 


tu {2, 0 = (2, , 
This contradiction completes the proof. 


COLUMBIA UNIVERSITY, 
NEW YORK, N. Y. 




ON THE DERIVATIVES OF FUNCTIONS ANALYTIC 
IN THE UNIT CIRCLE AND THEIR RADII OF 
UNIVALENCE AND OF p-VALENCE 


BY 
W. SEIDEL AND J. L. WALSH 


TABLE OF CONTENTS 


SECTION 
CHAPTER I. UNIVALENT FUNCTIONS 
6. Inequalities for higher derivatives 
8. Behavior of the first derivative almost everywhere 
9. Example on the slowness of approach of |f(z)| (1-|z|) ... 
CHAPTER II. BOUNDED FUNCTIONS: CONFIGURATIONS Cp 
13. Definition and some properties of Cp 
16. The limit property of Dp for continuously convergent sequences 
17. lim Dp(wn) = 0 is a necessary and sufficient condition for lim |f(zn)| (1-|zn|) ... 
CHAPTER III. BOUNDED FUNCTIONS: INEQUALITIES ON Dp 
18. A preliminary lower bound for ... 
20. A lower bound for the derivative of a circular product 
21. Numerical upper bound for Dp 
CHAPTER IV. FUNCTIONS WHICH OMIT TWO VALUES 
22. Inequalities for Dp(wn) when |f(zn)| is ... 
26. Counterexample for unrestricted functions 

Presented to the Society, February 26, 1938; received by the editors July 7, 1941. A short 
abstract of this paper under a slightly different title without details of proofs has already been 
published, Proceedings of the National Academy of Sciences, vol. 24 (1938), pp. 337-340. 


CHAPTER V. MISCELLANEOUS 
27. Limit values of analytic functions 
28. Extension of Bloch's theorem 
34. Some extensions to meromorphic functions 


1. Introduction. Various results are known concerning the order of growth 
of the first and higher derivatives of univalent and of bounded functions 
analytic in the unit circle, in the plane of the complex variable z. Among 
these may be mentioned Koebe’s distortion theorem (Verzerrungssatz) in the 
univalent case, and Schwarz's lemma and the results of O. Szász(1) in the 
bounded case. A consequence of these results for a function $f(z)$ analytic in 
$|z| < 1$ is $|f'(z)| = O((1-|z|)^{-3})$ in the case that $f(z)$ is univalent and 
$|f^{(n)}(z)| = O((1-|z|)^{-n})$ in the case that $f(z)$ is bounded. Various distortion 
theorems for bounded univalent functions were found by G. Pick and 
R. Nevanlinna(2). H. Frazer and more recently M. L. Cartwright have obtained 
results on the order of growth of p-valent functions(3) in a complete 
form. 

All these investigations, however, fail to give an adequate description of
the behavior of $|f'(z)|\,(1-|z|)$ as $|z| \to 1$ from the interior of the unit circle
$|z| < 1$. In the univalent case an answer to this question is contained in the
following result due to J. E. Littlewood without the precise constant involved
and to A. J. Macintyre(4) in the precise form stated here.


THEOREM 1. Let f(z) be analytic and univalent in $|z| < 1$ and let it omit there
the value $\omega$. Then, in $|z| < 1$ the following inequality is satisfied:

(1.1)    $|f'(z)|\,(1-|z|^2) \le 4\,|\omega - f(z)|$.


Theorem 1 is in fact essentially one form of Koebe’s distortion theorem, 
as we indicate below. 

The object of the present paper is to study in some detail the behavior
of expressions of the form $|f^{(n)}(z)|\,(1-|z|)^n$ for various classes of functions
f(z) analytic in the unit circle $|z| < 1$, especially the behavior as $|z| \to 1$. We
thus obtain results which can be interpreted as new distortion theorems. In


(1) O. Szász, Mathematische Zeitschrift, vol. 8 (1920), pp. 303-309.

(2) G. Pick, Sitzungsberichte der Kaiserlichen Akademie der Wissenschaften, Vienna,
Abteilung IIa, vol. 126 (1917), pp. 247-263; R. Nevanlinna, Oversigt af Finska Vetenskaps
Societetens Förhandlingar, vol. 62 (1919).

(3) M. L. Cartwright, Mathematische Annalen, vol. 111 (1935), pp. 98-118.

(4) J. E. Littlewood, Proceedings of the London Mathematical Society, vol. 23 (1924),
p. 507; A. J. Macintyre, Journal of the London Mathematical Society, vol. 11 (1936), pp. 7-11.




130 W. SEIDEL AND J. L. WALSH [July 


particular, the expression $|f'(z)|\,(1-|z|^2)$ is found to be closely connected
with the radius of univalence, which is now to be defined.


DEFINITION 1. Let w=f(z) be analytic in $|z| < 1$ and let R denote the Riemann
configuration(5) over the w-plane onto which this function maps the region
$|z| < 1$. Let $w_0$ be an arbitrary point, not a branch point, of R. Then the radius
of the largest smooth circle (boundary not included) with center at $w_0$ and wholly
contained in R is called the radius of univalence of R at $w_0$ and will be denoted by
$D_1(w_0)$. At a branch point $w_0$ of R we define $D_1(w_0)$ as zero.


In this definition $w_0$ refers to an actual point of R and not merely to any
point of R whose affix is the complex number $w_0$; the notation $D_1(w_0)$ is thus
not fully explicit. The reader will easily verify that the largest smooth circle
whose existence is asserted in the definition does exist and is unique.

This terminology differs from that of Montel(6), who uses the term modulus
of univalence for our radius of univalence. A similar comment applies to
the terminology radius of p-valence which we define in §14.

Explicit inequalities connecting $|f'(z)|\,(1-|z|^2)$ and $D_1(w)$ are obtained
for the class of functions f(z) univalent in $|z| < 1$ in Theorem 3, Chapter I,
for functions f(z) bounded in $|z| < 1$ in Theorem 3, Chapter II, and for
functions f(z) omitting two values in $|z| < 1$ in Theorems 2 and 4 of Chapter IV.
Analogous to the inequalities connecting $|f'(z)|\,(1-|z|^2)$ and $D_1(w)$ we
determine inequalities connecting $|f^{(k)}(z)|\,(1-|z|^2)^k$ for $k = 1, 2, \cdots, p$ and
$D_p(w)$, where $D_p(w)$ is the radius of the largest p-sheeted circle with center in
the point w contained in R. For the precise definitions the reader may be
referred to Chapter II, §§13, 14. We obtain such inequalities on higher
derivatives for the class of univalent functions in Theorem 5, Chapter I, for bounded
functions in Theorems 1 and 2 of Chapter III, and for the functions omitting
two values in Theorem 5, Chapter IV. For the detailed analysis of the paper
the reader is referred to the Table of Contents.

Applications of the results just mentioned occur throughout the paper, 
particularly in Chapter V. 


CHAPTER I. UNIVALENT FUNCTIONS 


2. Preliminary identities. In the sequel we shall make extensive use of a
lemma due to O. Szász(7).


LEMMA 1. Let f(z) be a function analytic in the circle $|z| < 1$. Let


(5) We use the term Riemann configuration on which the function w=f(z) regular in
$|z| < 1$ maps the circle $|z| < 1$ to denote that subregion of the Riemann surface of the inverse
function of w=f(z) which corresponds to the circle $|z| < 1$.

(6) Leçons sur les Fonctions Univalentes ou Multivalentes, Paris, 1933, pp. 22 and 110.

(7) O. Szász, Mathematische Zeitschrift, vol. 8 (1920), pp. 306-307.




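(2.1)    $g(\zeta) = f\!\left(\dfrac{\zeta+z}{1+\bar z\,\zeta}\right)$.

Then $g(\zeta)$ is a function regular in $|\zeta| < 1$ for every value of z in $|z| < 1$ and

(2.2)    $\dfrac{(1-|z|^2)^n f^{(n)}(z)}{n!} = \dfrac{g^{(n)}(0)}{n!} + \binom{n-1}{1}\bar z\,\dfrac{g^{(n-1)}(0)}{(n-1)!} + \binom{n-1}{2}\bar z^{2}\,\dfrac{g^{(n-2)}(0)}{(n-2)!} + \cdots + \bar z^{\,n-1} g'(0)$.

We omit the proof of Lemma 1 and proceed to the proof of

LEMMA 2. Let f(z) be a function analytic in the circle $|z| < 1$. Let

$g(\zeta) = f\!\left(\dfrac{\zeta+z}{1+\bar z\,\zeta}\right)$.

Then, for every fixed value of z in $|z| < 1$, $g(\zeta)$ is a function of $\zeta$ regular in $|\zeta| < 1$,
and

(2.3)    $\dfrac{g^{(n)}(0)}{n!} = \dfrac{(1-|z|^2)^n f^{(n)}(z)}{n!} - \binom{n-1}{1}\bar z\,\dfrac{(1-|z|^2)^{n-1} f^{(n-1)}(z)}{(n-1)!} + \binom{n-1}{2}\bar z^{2}\,\dfrac{(1-|z|^2)^{n-2} f^{(n-2)}(z)}{(n-2)!} - \cdots + (-1)^{n-1}\bar z^{\,n-1}(1-|z|^2) f'(z)$.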


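Let us write equation (2.2) for n=k and allow k to assume the values
$1, 2, \cdots, n$:

(2.4)    $\dfrac{(1-|z|^2)^k f^{(k)}(z)}{k!} = \dfrac{g^{(k)}(0)}{k!} + \binom{k-1}{1}\bar z\,\dfrac{g^{(k-1)}(0)}{(k-1)!} + \cdots + \bar z^{\,k-1} g'(0)$.

Let us proceed similarly with (2.3):

(2.5)    $\dfrac{g^{(k)}(0)}{k!} = \dfrac{(1-|z|^2)^k f^{(k)}(z)}{k!} - \binom{k-1}{1}\bar z\,\dfrac{(1-|z|^2)^{k-1} f^{(k-1)}(z)}{(k-1)!} + \cdots + (-1)^{k-1}\bar z^{\,k-1}(1-|z|^2) f'(z)$.

The lemma will be proved if it can be shown that (2.5) is obtained from (2.4)
by solving the latter system for $g^{(k)}(0)/k!$ $(k = 1, 2, \cdots, n)$. To do that it
suffices to prove that the matrix of the coefficients of (2.4),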




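$A = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \\ \bar z & 1 & 0 & \cdots & 0 \\ \bar z^{2} & 2\bar z & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ \bar z^{\,n-1} & \binom{n-1}{1}\bar z^{\,n-2} & \binom{n-1}{2}\bar z^{\,n-3} & \cdots & 1 \end{pmatrix}$,

and the matrix of the coefficients of (2.5),

$A' = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \\ -\bar z & 1 & 0 & \cdots & 0 \\ \bar z^{2} & -2\bar z & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ (-1)^{n-1}\bar z^{\,n-1} & (-1)^{n-2}\binom{n-1}{1}\bar z^{\,n-2} & (-1)^{n-3}\binom{n-1}{2}\bar z^{\,n-3} & \cdots & 1 \end{pmatrix}$,

are inverse matrices, or that $A \cdot A' = I$, I being the unit matrix. Now, it is
immediately evident that the elements in the principal diagonal of the product
matrix are 1, while the element in the kth row and lth column, where $k > l$, is

(2.6)    $\sum_{j=l}^{k} (-1)^{j-l}\binom{k-1}{j-1}\binom{j-1}{l-1}\,\bar z^{\,k-l}$.

The sum (2.6) may be written as follows:

$\bar z^{\,k-l}\binom{k-1}{l-1}\sum_{\nu=0}^{k-l}(-1)^{\nu}\binom{k-l}{\nu} = \bar z^{\,k-l}\binom{k-1}{l-1}(1-1)^{k-l}$,

which is zero. The case $k < l$ may be treated similarly. This proves that $A \cdot A'$
is the unit matrix. Thus, Lemma 2 is established.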
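
The inversion used in the proof of Lemma 2 can also be checked mechanically. The following is a minimal sketch, not part of the original argument: it assumes that the coefficient matrices have the entries $\binom{k-1}{l-1}\bar z^{\,k-l}$ and $\binom{k-1}{l-1}(-\bar z)^{k-l}$ suggested by their first three rows, and it replaces $\bar z$ by an integer stand-in `x` so that the arithmetic is exact. The helper names `coefficient_matrix`, `mat_mul` and `is_identity` are hypothetical.

```python
from math import comb


def coefficient_matrix(n, x):
    """Lower-triangular n-by-n matrix whose entry in row k, column l
    (1-indexed) is C(k-1, l-1) * x**(k-l); x stands in for conj(z)."""
    return [[comb(k - 1, l - 1) * x ** (k - l) if l <= k else 0
             for l in range(1, n + 1)]
            for k in range(1, n + 1)]


def mat_mul(a, b):
    """Plain product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][j] * b[j][k] for j in range(n)) for k in range(n)]
            for i in range(n)]


def is_identity(m):
    """True when m is exactly the unit matrix."""
    n = len(m)
    return all(m[i][j] == (1 if i == j else 0)
               for i in range(n) for j in range(n))


n, x = 6, 3                          # arbitrary size and integer stand-in
A = coefficient_matrix(n, x)         # assumed coefficients of (2.4)
A_prime = coefficient_matrix(n, -x)  # assumed coefficients of (2.5)
print(is_identity(mat_mul(A, A_prime)))
```

Since every entry of the product is a polynomial in the stand-in, agreement for integer values is consistent with the identity for complex $\bar z$; the check illustrates the lemma and does not replace its proof.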

3. Littlewood-Macintyre theorem. We proceed to prove Theorem 1; this
method is different from those of Littlewood and Macintyre. Indeed, form the
function

(3.1)    $\varphi(\zeta) = \dfrac{f\!\left(\dfrac{\zeta+z}{1+\bar z\,\zeta}\right) - f(z)}{(1-|z|^2)\,f'(z)}$

for a fixed value of z in $|z| < 1$(8). This function is evidently regular and
univalent in $|\zeta| < 1$ and omits there the value $(\omega - f(z))/((1-|z|^2)f'(z))$,
where $\omega$ is the value omitted by f(z). Since, furthermore, $\varphi(0) = 0$ and
$\varphi'(0) = 1$, we may apply a well known result(9) of
Koebe in the theory of univalent functions, according to which

$\left|\dfrac{\omega - f(z)}{(1-|z|^2)\,f'(z)}\right| \ge \dfrac{1}{4}$.

(8) The function $\varphi(\zeta)$ plays an important role in the theory of univalent functions, cf.
P. Montel, Leçons sur les Fonctions Univalentes ou Multivalentes, Paris, 1933, p. 51.
(9) See, for example, P. Montel, loc. cit., p. 50.


This proves the theorem. Direct computation shows that the limit is attained
for the univalent function $z/(1-z)^2$ and $\omega = -1/4$. Of course Koebe's
theorem is the special case z=0 of Theorem 1.
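
The extremal case just mentioned can be illustrated numerically. The sketch below is an illustration only: it takes Theorem 1 in the form $|f'(z)|(1-|z|^2) \le 4|\omega - f(z)|$ (our reading of the inequality, consistent with the proof above) and evaluates both sides for $f(z) = z/(1-z)^2$ with omitted value $-1/4$. The function names are hypothetical.

```python
import cmath


def koebe(z):
    """f(z) = z / (1 - z)**2, the extremal function named in the text."""
    return z / (1 - z) ** 2


def koebe_prime(z):
    """Derivative f'(z) = (1 + z) / (1 - z)**3."""
    return (1 + z) / (1 - z) ** 3


OMITTED = -0.25  # value omitted by f in the unit circle


def both_sides(z):
    """Return (|f'(z)| (1 - |z|^2), 4 |omitted - f(z)|)."""
    lhs = abs(koebe_prime(z)) * (1 - abs(z) ** 2)
    rhs = 4 * abs(OMITTED - koebe(z))
    return lhs, rhs


# On the real diameter the two sides agree up to rounding.
for x in (-0.9, -0.3, 0.0, 0.5, 0.8):
    lhs, rhs = both_sides(x)
    print(f"z = {x:+.1f}: lhs = {lhs:.6f}, rhs = {rhs:.6f}")

# Off the real axis the left side is strictly smaller.
print(both_sides(0.6 * cmath.exp(1j)))
```

For this function both sides reduce to $|1+z|^2/|1-z|^2$ times a factor that equals 1 exactly when $z$ is real, which is why equality holds along the whole real diameter.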

4. Inequalities concerning $D_1$. For the sequel it is desirable to restate
Theorem 1 in a more geometric form. If we set w=f(z), the right side of
inequality (1.1) attains its least value when the omitted value $\omega$ is one of those
boundary points of the region R onto which f(z) maps the circle $|z| < 1$ which are
nearest the point w. In that case $|\omega - f(z)| = D_1(w)$, as defined in the introduction, and
Theorem 1 becomes


THEOREM 1'. Let f(z) be analytic and univalent in $|z| < 1$. Then the inequality

(4.1)    $(1-|z|^2)\,|f'(z)| \le 4D_1(w)$

is satisfied for all values of z in $|z| < 1$, where $D_1(w)$ is the radius of univalence
at the point w=f(z) of the region R onto which f(z) maps the circle $|z| < 1$.


It may be of some interest to point out a geometric interpretation of the
left side of inequality (4.1). Denote by $\rho(w)$ the "inner radius" of R with
respect to a fixed interior point w(10). Then $\rho(w)$ can be expressed in terms of
f(z) as follows:


(4.2)    $\rho(w) = |f'(z)|\,(1-|z|^2)$,


where z is the point corresponding to w. Inequality (4.1) may, therefore, be
written in the geometric form(11)


(4.3)    $\rho(w) \le 4D_1(w)$.


Theorem 1' gives an upper bound for $|f'(z)|\,(1-|z|^2)$. It is desirable also
to obtain a lower bound for this expression.


THEOREM 2. Let f(z) be analytic in $|z| < 1$, let $z_0$ be any point of $|z| < 1$,
and $w_0 = f(z_0)$. Then


(4.4)    $D_1(w_0) \le |f'(z_0)|\,(1-|z_0|^2)$.


We notice that unlike (4.1), the relation (4.4) holds without any restriction
other than analyticity on the function f(z). Denote by R the Riemann
surface over the w-plane onto which w=f(z) maps the circle $|z| < 1$. If $w_0$ is


(10) The "inner radius" of a simply connected region R with respect to an interior point $w_0$
is the radius of the circle on which the region R can be mapped conformally by a function f(w)
so that $f(w_0) = 0$ and $f'(w_0) = 1$. Cf. G. Pólya and G. Szegö, Aufgaben und Lehrsätze, vol. II,
Berlin, 1925, pp. 16-21.

(11) Inequalities (4.1) and (4.3) together with Corollary 2 below were first proved by J. L.
Walsh, Bulletin of the American Mathematical Society, vol. 44 (1938), pp. 520-523. In the
same paper the author suggests the use of the present method in the study of higher derivatives
of univalent functions, which is one of the principal topics taken up in the present chapter.




a branch point of R, (4.4) is trivial, for in that case both sides of the inequality
reduce to zero. Otherwise, let


$g(\zeta) = f\!\left(\dfrac{\zeta+z_0}{1+\bar z_0\,\zeta}\right)$.


This function is also analytic in $|\zeta| < 1$ and maps the circle onto R.
Furthermore, $g(0) = w_0$. If we denote by $\zeta = h(w)$ the inverse function of $w = g(\zeta)$, the
function h(w) is defined, regular, and single-valued on R. In particular, a
suitable branch of h(w) will be regular and single-valued on the single-sheeted
circle C with center at $w_0$ and radius $D_1(w_0)$. The values which this branch
assumes in C all lie in the circle $|\zeta| < 1$. Hence, in C: $|h(w)| < 1$, $h(w_0) = 0$.
Consequently, applying Schwarz's lemma,


$|h'(w_0)| \le \dfrac{1}{D_1(w_0)}$.


Hence, $|g'(0)| \ge D_1(w_0)$ and the evaluation of $g'(0)$ in terms of f(z) yields
(4.4).
The inequality in (4.4) is sharp, reducing to an equality when


212 
Combining Theorems 1' and 2, we obtain


THEOREM 3. Let f(z) be regular and univalent in $|z| < 1$, let $z_0$ be any point
of $|z| < 1$, and $w_0 = f(z_0)$. Then,


(4.5)    $D_1(w_0) \le |f'(z_0)|\,(1-|z_0|^2) \le 4D_1(w_0)$.


We remark that Theorems 1 and 1' can be somewhat improved if we
assume f(z) not merely analytic and univalent in $|z| < 1$, but also bounded
there: $|f(z)| < M$. Under those conditions the function $\varphi(\zeta)$ defined by (3.1)
is also analytic and univalent there, with $\varphi(0) = 0$, $\varphi'(0) = 1$,


$|\varphi(\zeta)| \le \dfrac{2M}{|f'(z)|\,(1-|z|^2)}$.


Since $\varphi(\zeta)$ in $|\zeta| < 1$ omits the value

$\dfrac{\omega - f(z)}{f'(z)(1-|z|^2)}$

provided the function f(z) omits the value $\omega$, the inequality of Pick(12) yields


(12) That is to say, under a smooth map of the region $|z| < 1$ by a function w=f(z) with
$f(0) = 0$, $f'(0) = 1$, $|f(z)| < M$, every boundary point of the image in the w-plane satisfies the
inequality $|w| \ge [M - (M^2 - M)^{1/2}]^2$. See Pick, and R. Nevanlinna, loc. cit.




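$D_1(w_0) \ge \left[\dfrac{2M}{|f'(z_0)|^{1/2}(1-|z_0|^2)^{1/2}} - \left(\dfrac{4M^2}{|f'(z_0)|\,(1-|z_0|^2)} - 2M\right)^{1/2}\right]^2$,

or

$\left[\dfrac{4M\,D_1(w_0)^{1/2}}{D_1(w_0) + 2M}\right]^2 \ge |f'(z_0)|\,(1-|z_0|^2)$.


It may be noted that as M becomes infinite this last inequality approaches
the form (4.1).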


5. Applications. From Theorem 3 various corollaries may be immediately
deduced.


COROLLARY 1. Let f(z) be regular and univalent in $|z| < 1$, $\{z_n\}$ any sequence
of points in $|z| < 1$, and $w_n = f(z_n)$. Then, a necessary and sufficient condition
that


$\lim_{n\to\infty} |f'(z_n)|\,(1-|z_n|) = 0$


is that

$\lim_{n\to\infty} D_1(w_n) = 0$,


and a necessary and sufficient condition that $|f'(z_n)|\,(1-|z_n|)$ remain bounded
is that $D_1(w_n)$ remain bounded.


COROLLARY 2. Let f(z) be regular, univalent, and bounded in $|z| < 1$, $\{z_n\}$
any sequence of points in $|z| < 1$ for which $\lim_{n\to\infty} |z_n| = 1$. Then


$\lim_{n\to\infty} |f'(z_n)|\,(1-|z_n|) = 0$.


The proof of Corollary 1 follows directly from the inequalities (4.5), while
Corollary 2 follows from Corollary 1 if one remarks that under the hypotheses
of Corollary 2 we have $D_1(w_n) \to 0$(13). Another consequence of (4.5) is the
following:


COROLLARY 3. Let f(z) be regular and univalent in $|z| < 1$, let $z_0$ be any point
of $|z| = 1$. Then there exists a sequence of points $\{z_n\}$ ($|z_n| < 1$) converging to $z_0$
such that


$\lim_{n\to\infty} |f'(z_n)|\,(1-|z_n|) = 0$.


In accordance with Corollary 1 it suffices to find a sequence $\{z_n\}$ converging
to $z_0$ for which the points $w_n = f(z_n)$ satisfy the relation $D_1(w_n) \to 0$. Such a


(13) As was pointed out by Walsh (loc. cit.), Corollary 2 may also be proved by
Carathéodory's method of the conformal mapping of variable regions, cf. C. Carathéodory, Conformal
Representation, Cambridge, 1932, p. 75.




sequence may be found as follows. It is well known(14) that a univalent
function has finite limit values on almost all radii. These limit values are
boundary points of the region onto which f(z) maps the circle $|z| < 1$. Choose a
sequence of such radii $r_n$ which converges to the radius joining $z_0$ with the
origin. On the radius $r_n$ choose a point $z_n$ ($|z_n| < 1$) so near to the
circumference $|z| = 1$ that

$D_1(w_n) < 1/n$.


This sequence $\{z_n\}$ fulfills the necessary requirements.

6. Inequalities for higher derivatives. We now turn to the corresponding
study of the higher derivatives of univalent functions. In particular, we shall
determine upper bounds for expressions of the form


(6.1)    $|f^{(n)}(z_0)|\,(1-|z_0|^2)^n$.


It is clear immediately that lower bounds for these expressions in terms of
$D_1(w)$ cannot be obtained even in the case n=2. For the expression (6.1) is
identically zero for $n \ge 2$ when f(z)=z. Even for the upper bounds of (6.1) the
sharp inequalities will now be obtained only in the case n=2, 3. For higher
values of n the corresponding inequalities depend on the assumption of the
truth of Bieberbach's conjecture, which up to the present has not been
established.
We begin by proving the following inequalities.


THEOREM 4. Let f(z) be regular and univalent in $|z| < 1$, let $z_0$ be any point
of $|z| < 1$, and let $w_0 = f(z_0)$. Then,


(6.2)    $|f''(z_0)|\,(1-|z_0|^2)^2 \le 8(|z_0| + 2)\,D_1(w_0)$
and
(6.3)    $|f'''(z_0)|\,(1-|z_0|^2)^3 \le 24(|z_0|^2 + 4|z_0| + 3)\,D_1(w_0)$.


These inequalities are sharp, reducing to equalities for $f(z) = z/(1+z)^2$ for
real negative values of z.


To prove (6.2) and (6.3) compute the second and third Taylor coefficients,
$b_2$ and $b_3$, of the function (3.1) where we set $z = z_0$. By direct computation (or
by §2, Lemma 2) we find that


(6.4)    $b_2 = \dfrac{1}{2}\,\dfrac{f''(z_0)}{f'(z_0)}\,(1-|z_0|^2) - \bar z_0$,

         $b_3 = \dfrac{1}{6}\,\dfrac{f'''(z_0)}{f'(z_0)}\,(1-|z_0|^2)^2 - \bar z_0\,\dfrac{f''(z_0)}{f'(z_0)}\,(1-|z_0|^2) + \bar z_0^{\,2}$.


(14) See, for example, W. Seidel, Mathematische Annalen, vol. 104 (1931), p. 191.




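Now, according to Bieberbach's theorem and Löwner's theorem(15) $|b_2| \le 2$
and $|b_3| \le 3$. Hence


$\left|\dfrac{1}{2}\,\dfrac{f''(z_0)}{f'(z_0)}\,(1-|z_0|^2) - \bar z_0\right| \le 2$


and
(6.2')    $|f''(z_0)(1-|z_0|^2)^2 - 2\bar z_0 f'(z_0)(1-|z_0|^2)| \le 4(1-|z_0|^2)\,|f'(z_0)|$.


Applying (4.1) we obtain at once inequality (6.2). To obtain (6.3) we use the
evaluation of $b_3$ in (6.4) and write


$\left|\dfrac{1}{6}\,\dfrac{f'''(z_0)}{f'(z_0)}\,(1-|z_0|^2)^2 - \bar z_0\,\dfrac{f''(z_0)}{f'(z_0)}\,(1-|z_0|^2) + \bar z_0^{\,2}\right| \le 3$


and


$|f'''(z_0)(1-|z_0|^2)^3 - 6\bar z_0 f''(z_0)(1-|z_0|^2)^2 + 6\bar z_0^{\,2} f'(z_0)(1-|z_0|^2)| \le 18\,|f'(z_0)|\,(1-|z_0|^2)$.

It follows that


$|f'''(z_0)|\,(1-|z_0|^2)^3 \le |6\bar z_0 f''(z_0)(1-|z_0|^2)^2 - 6\bar z_0^{\,2} f'(z_0)(1-|z_0|^2)| + 18\,|f'(z_0)|\,(1-|z_0|^2)$

$\le 6|z_0|\,|f''(z_0)(1-|z_0|^2)^2 - 2\bar z_0 f'(z_0)(1-|z_0|^2)| + 6|z_0|^2\,|f'(z_0)|\,(1-|z_0|^2) + 18\,|f'(z_0)|\,(1-|z_0|^2)$.

Applying now inequalities (6.2') and (4.1) we obtain inequality (6.3).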
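
The sharpness asserted in Theorem 4 can be illustrated numerically. The sketch below is a check under stated assumptions and not part of the original proof: it reads (6.2) and (6.3) as $|f''(z_0)|(1-|z_0|^2)^2 \le 8(|z_0|+2)D_1(w_0)$ and $|f'''(z_0)|(1-|z_0|^2)^3 \le 24(|z_0|^2+4|z_0|+3)D_1(w_0)$, uses the derivatives of $f(z) = z/(1+z)^2$ computed by hand, and takes $D_1(w_0)$ at a negative real point to be the distance from $w_0$ to the omitted ray from $1/4$ to $+\infty$. All function names are hypothetical.

```python
def second_derivative(z):
    """f''(z) for f(z) = z / (1 + z)**2."""
    return (2 * z - 4) / (1 + z) ** 4


def third_derivative(z):
    """f'''(z) for f(z) = z / (1 + z)**2."""
    return (18 - 6 * z) / (1 + z) ** 5


def radius_on_negative_axis(r):
    """D_1(w_0) at w_0 = f(-r), 0 <= r < 1: the distance from f(-r)
    to the omitted ray [1/4, +infinity) on the real axis."""
    w0 = -r / (1 - r) ** 2
    return 0.25 - w0


def theorem4_sides(r):
    """Both sides of (6.2) and of (6.3) at z_0 = -r."""
    z0 = -r
    d1 = radius_on_negative_axis(r)
    weight = 1 - r * r
    second = (abs(second_derivative(z0)) * weight ** 2, 8 * (r + 2) * d1)
    third = (abs(third_derivative(z0)) * weight ** 3,
             24 * (r * r + 4 * r + 3) * d1)
    return second, third


for r in (0.1, 0.5, 0.9):
    print(r, theorem4_sides(r))
```

In each printed pair the two entries agree up to rounding, in line with the statement that equality is attained along the negative real axis.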
If now Bieberbach's conjecture concerning the coefficients of univalent
functions were known to be true(16), one could write

$|b_n| = \left|\dfrac{\varphi^{(n)}(0)}{n!}\right| \le n$.


With the aid of a little algebraic manipulation (see below) this would lead to
the sharp inequality


(6.5)    $|f^{(n)}(z_0)|\,(1-|z_0|^2)^n \le 4\,n!\,(n + |z_0|)(1 + |z_0|)^{n-2}\,D_1(w_0)$,


which becomes an equality for $f(z) = z/(1+z)^2$ for real negative values of z.
Unfortunately, however, the inequality $|b_n| \le n$ has been proved only for

(15) L. Bieberbach, Sitzungsberichte der Königlichen Preussischen Akademie der
Wissenschaften zu Berlin, vol. 38 (1916), pp. 940-955; K. Löwner, Mathematische Annalen, vol. 89
(1923), pp. 103-121.

(16) See, for example, L. Bieberbach, Lehrbuch der Funktionentheorie, vol. 2, 2d edition,
1931, p. 80, Footnote 4.




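n=2 and 3, so that the validity of inequality (6.5) has been established for
n=2 and 3 only. Weaker inequalities have actually been proved by various
authors, in particular, J. E. Littlewood(17) who showed


(6.6)    $\left|\dfrac{\varphi^{(n)}(0)}{n!}\right| < en$


and E. Landau(18) who showed

$\left|\dfrac{\varphi^{(n)}(0)}{n!}\right| < \left(\dfrac{1}{2} + \dfrac{1}{\pi}\right) en$.

Making use of (6.6) and Lemma 1 of §2 we find

$\dfrac{(1-|z|^2)^n\,|f^{(n)}(z)|}{n!} \le e\,(1-|z|^2)\,|f'(z)| \sum_{\nu=0}^{n-1} \binom{n-1}{\nu}(n-\nu)\,|z|^{\nu}$

and using (4.1)

$\dfrac{(1-|z|^2)^n\,|f^{(n)}(z)|}{n!} \le 4e\,D_1(w) \sum_{\nu=0}^{n-1} \binom{n-1}{\nu}(n-\nu)\,|z|^{\nu}$.

Since, however,

$\sum_{\nu=0}^{n-1} \binom{n-1}{\nu}|z|^{\nu} = (1+|z|)^{n-1}$

and

$\sum_{\nu=0}^{n-1} \nu\binom{n-1}{\nu}|z|^{\nu} = (n-1)\,|z|\,(1+|z|)^{n-2}$,

we obtain

$(1-|z|^2)^n\,|f^{(n)}(z)| \le 4e \cdot n!\,D_1(w)\,[n(1+|z|) - (n-1)|z|]\,(1+|z|)^{n-2}$

and finally

$(1-|z_0|^2)^n\,|f^{(n)}(z_0)| \le 4e \cdot n!\,(n + |z_0|)(1+|z_0|)^{n-2}\,D_1(w_0)$.

This clearly is not a sharp inequality. We thus obtain


THEOREM 5. Let f(z) be regular and univalent in $|z| < 1$, let $z_0$ be any point
in $|z| < 1$, and let $w_0 = f(z_0)$. Then

(6.7)    $|f^{(n)}(z_0)|\,(1-|z_0|^2)^n \le 4e \cdot n!\,(|z_0| + n)(1+|z_0|)^{n-2}\,D_1(w_0)$.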
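
The "little algebraic manipulation" behind (6.5) and (6.7) amounts, on our reading, to the identity $\sum_{\nu=0}^{n-1}\binom{n-1}{\nu}(n-\nu)x^{\nu} = (n+x)(1+x)^{n-2}$ with $x = |z_0|$. The following minimal sketch checks that identity in exact rational arithmetic; the function names are hypothetical and the identity itself is our reconstruction of the step, not a quotation.

```python
from fractions import Fraction
from math import comb


def weighted_binomial_sum(n, x):
    """Sum over v = 0..n-1 of C(n-1, v) * (n - v) * x**v."""
    return sum(comb(n - 1, v) * (n - v) * x ** v for v in range(n))


def closed_form(n, x):
    """Closed form (n + x) * (1 + x)**(n - 2), used here for n >= 2."""
    return (n + x) * (1 + x) ** (n - 2)


for n in range(2, 8):
    x = Fraction(1, 3)  # arbitrary rational stand-in for |z_0|
    print(n, weighted_binomial_sum(n, x) == closed_form(n, x))
```

For n=2 and n=3 the closed form gives $2+x$ and $x^2+4x+3$, the polynomial factors that appear in (6.2) and (6.3).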

From this inequality we obtain again two corollaries analogous to those 
of Theorem 3. 


(17) J. E. Littlewood, loc. cit., p. 498.
(18) E. Landau, Mathematische Zeitschrift, vol. 30 (1929), p. 635.




COROLLARY 4. Let f(z) be regular and univalent in $|z| < 1$, $\{z_n\}$ any sequence
of points in $|z| < 1$ and $w_n = f(z_n)$. Then, if


$\lim_{n\to\infty} D_1(w_n) = 0$,


all the derivatives of f(z) will satisfy the relation

$\lim_{n\to\infty} |f^{(k)}(z_n)|\,(1-|z_n|)^k = 0$,    $k = 1, 2, 3, \cdots$.

Clearly the converse of the theorem is false since taking f(z)=z, $z_n = 0$
($n = 1, 2, \cdots$), we have $f^{(k)}(z_n) = 0$ for all $k \ge 2$ and all n while $D_1(w_n) = 1$.


COROLLARY 5. Let f(z) be regular, univalent, and bounded in $|z| < 1$, $\{z_n\}$
any sequence of points in $|z| < 1$ for which $\lim_{n\to\infty} |z_n| = 1$. Then


$\lim_{n\to\infty} |f^{(k)}(z_n)|\,(1-|z_n|)^k = 0$,    $k = 1, 2, 3, \cdots$.

7. Applications. A few remarks concerning Theorem 3 will now be made.
Koebe's "Verzerrungssatz" can be written in the form(19)

$\dfrac{1-|z|}{(1+|z|)^3} \le |f'(z)| \le \dfrac{1+|z|}{(1-|z|)^3}$.


If we combine this inequality with (4.5) we obtain

$\dfrac{1}{4}\left(\dfrac{1-|z|}{1+|z|}\right)^2 \le D_1(w) \le \left(\dfrac{1+|z|}{1-|z|}\right)^2$.

We may state this result as follows:


COROLLARY 6. Let f(z) be regular and univalent in $|z| < 1$ with $f(0) = 0$,
$f'(0) = 1$, let $z_0$ be any point of $|z| < 1$, and let $w_0 = f(z_0)$. Then the radius of
univalence $D_1(w_0)$ at the point $w_0$ satisfies the inequality


$\dfrac{1}{4}\left(\dfrac{1-|z_0|}{1+|z_0|}\right)^2 \le D_1(w_0) \le \left(\dfrac{1+|z_0|}{1-|z_0|}\right)^2$.

The lower bound of $D_1(w_0)$ was obtained in less precise form by W. E.
Sewell(20). The first inequality is sharp, becoming an equality for
$f(z) = z/(1+z)^2$ along the positive real axis. The second inequality is probably
not sharp.
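
The two bounds of Corollary 6 can be explored numerically for the function $f(z) = z/(1+z)^2$. The sketch below assumes, as in the corollary on our reading, the bounds $\frac14\left(\frac{1-|z_0|}{1+|z_0|}\right)^2 \le D_1(w_0) \le \left(\frac{1+|z_0|}{1-|z_0|}\right)^2$, and uses the fact that this function maps the unit circle onto the plane slit along the real axis from $1/4$ to $+\infty$, so that $D_1(w_0)$ is the distance from $w_0$ to that ray. The function names are hypothetical.

```python
import cmath


def koebe_plus(z):
    """f(z) = z / (1 + z)**2, normalized so that f(0) = 0, f'(0) = 1."""
    return z / (1 + z) ** 2


def dist_to_slit(w):
    """Distance from w to the ray [1/4, +infinity) on the real axis."""
    if w.real >= 0.25:
        return abs(w.imag)
    return abs(w - 0.25)


def corollary6_bounds(z0):
    """Return (lower bound, D_1(w_0), upper bound) at the point z0."""
    m = abs(z0)
    lower = 0.25 * ((1 - m) / (1 + m)) ** 2
    upper = ((1 + m) / (1 - m)) ** 2
    return lower, dist_to_slit(complex(koebe_plus(z0))), upper


# Along the positive real axis the lower bound is attained.
print(corollary6_bounds(0.5))
# At a generic point both inequalities hold with room to spare.
print(corollary6_bounds(0.7 * cmath.exp(2j)))
```

The first printed triple shows the lower bound coinciding with $D_1(w_0)$, which illustrates the sharpness statement; nothing in this experiment settles whether the upper bound is sharp.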

Another application of Theorem 3 concerns infinite regions. Suppose that
R is a simply connected region of the w-plane for which $w = \infty$ is an accessible
boundary point, let


(19) See, for instance, Paul Montel, loc. cit., p. 52.
(20) W. E. Sewell, these Transactions, vol. 41 (1937), p. 90.




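$\limsup_{w\to\infty} D_1(w) = D$,


where w is an interior point of R, and let w=f(z) map the interior of
the circle $|z| < 1$ on R; suppose that $z = \alpha$, ($|\alpha| = 1$), corresponds to $w = \infty$. From
Theorem 3 it follows that


$\limsup_{z\to\alpha} |f'(z)|\,(1-|z|^2) \le 4D$.


For an arbitrary infinite region the relations $w_n = f(z_n) \to \infty$, $\limsup_{n\to\infty} D_1(w_n) = D$,
$\liminf_{n\to\infty} D_1(w_n) = d$ clearly imply that $\limsup_{n\to\infty} |f'(z_n)|\,(1-|z_n|^2) \le 4D$,
$\liminf_{n\to\infty} |f'(z_n)|\,(1-|z_n|^2) \ge d$.

The final remark concerns an inequality derived by G. Szegö(21) on the
difference quotient of a univalent function. His inequality is as follows: Let
f(z) be regular and univalent in $|z| < 1$, let $z_1$ and $z_2$ be any two points of the
circle $|z| < 1$. Then,


(7.1)    $\dfrac{|f'(z_2)|\,(1-|z_2|^2)\,|1-\bar z_1 z_2|}{(|z_1-z_2| + |1-\bar z_1 z_2|)^2} \le \left|\dfrac{f(z_1)-f(z_2)}{z_1-z_2}\right| \le \dfrac{|f'(z_2)|\,(1-|z_2|^2)\,|1-\bar z_1 z_2|}{(|z_1-z_2| - |1-\bar z_1 z_2|)^2}$.


Let us introduce the non-euclidean distance $\rho(z_1, z_2)$ between the points
$z_1$ and $z_2$ by means of the following relations


$\rho(z_1, z_2) = \log\dfrac{1+r}{1-r}$,    $r = \left|\dfrac{z_1-z_2}{1-\bar z_1 z_2}\right|$.

By virtue of (7.1) and (4.5) we obtain the inequalities


$\dfrac{D_1(w_2)\,|1-\bar z_1 z_2|}{(|z_1-z_2| + |1-\bar z_1 z_2|)^2} \le \left|\dfrac{f(z_1)-f(z_2)}{z_1-z_2}\right| \le \dfrac{4D_1(w_2)\,|1-\bar z_1 z_2|}{(|z_1-z_2| - |1-\bar z_1 z_2|)^2}$,


where $w_2 = f(z_2)$. In terms of $\rho(z_1, z_2)$ the inequalities become

(7.2)    $\tfrac{1}{4}\,D_1(w_2)\,(1 - e^{-2\rho(z_1,z_2)}) \le |f(z_1) - f(z_2)| \le D_1(w_2)\,(e^{2\rho(z_1,z_2)} - 1)$.

From the inequalities (7.2) we obtain the corollary:


COROLLARY 7. Let f(z) be regular and univalent in $|z| < 1$, let $\{z_n\}$ and $\{z_n'\}$
be two sequences of points in $|z| < 1$, such that $\rho(z_n, z_n')$ is bounded and let
$w_n = f(z_n)$. Then $\lim_{n\to\infty} |f(z_n) - f(z_n')| = 0$ if, and only if,

$\lim_{n\to\infty} (e^{2\rho(z_n, z_n')} - 1)\,D_1(w_n) = 0$.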
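
The passage from bounds in terms of $r = |(z_1 - z_2)/(1 - \bar z_1 z_2)|$ to bounds in terms of the non-euclidean distance rests on two elementary identities, $r/(1+r)^2 = (1 - e^{-2\rho})/4$ and $r/(1-r)^2 = (e^{2\rho} - 1)/4$, valid when $\rho = \log\frac{1+r}{1-r}$. The following minimal sketch checks them numerically; it assumes that definition of $\rho$ as read from the text, and its function names are hypothetical.

```python
import math


def pseudo_hyperbolic(z1, z2):
    """r = |(z1 - z2) / (1 - conj(z1) * z2)| for points of the unit circle."""
    z1, z2 = complex(z1), complex(z2)
    return abs((z1 - z2) / (1 - z1.conjugate() * z2))


def noneuclidean_distance(z1, z2):
    """rho(z1, z2) = log((1 + r) / (1 - r))."""
    r = pseudo_hyperbolic(z1, z2)
    return math.log((1 + r) / (1 - r))


def bound_factors(z1, z2):
    """Return the pairs (r/(1+r)^2, (1 - e^{-2 rho})/4) and
    (r/(1-r)^2, (e^{2 rho} - 1)/4)."""
    r = pseudo_hyperbolic(z1, z2)
    rho = noneuclidean_distance(z1, z2)
    lower = (r / (1 + r) ** 2, (1 - math.exp(-2 * rho)) / 4)
    upper = (r / (1 - r) ** 2, (math.exp(2 * rho) - 1) / 4)
    return lower, upper


print(bound_factors(0.3 + 0.4j, -0.2 + 0.1j))
```

The two entries of each printed pair agree up to rounding. Because both factors are comparable when $\rho$ stays bounded, the two-sided estimate collapses to the single condition stated in Corollary 7.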


(21) G. Szegö, Mathematische Annalen, vol. 100 (1928), pp. 190-191.




8. Behavior of the first derivative almost everywhere. Corollary 2 may be
stated as asserting that for a regular, univalent, and bounded function in the
circle $|z| < 1$ the first derivative is of order $o((1-r)^{-1})$ on all radii of the circle.
The next theorem shows, however, that this order of growth can be attained
only on a small number of radii and that on most radii the order of growth is
considerably smaller. Indeed, we prove the following


THEOREM 6. Let f(z) be regular and univalent in the circle $|z| < 1$. Then

(8.1)    $\lim_{z\to e^{i\alpha}} |f'(z)|\,(1-|z|)^{1/2} = 0$


for all points $e^{i\alpha}$ of the circumference $|z| = 1$ with the exception of at most a set
of measure zero, where z in the above limit is taken in any angle less than $\pi$ with
vertex in $e^{i\alpha}$ and bisected by the radius joining z=0 with $z = e^{i\alpha}$. Furthermore, in
any such angle the above limit is uniform.


The proof depends on a number of lemmas.
LEMMA 3. If f(z) is univalent in the circle $|z| < 1$, then on almost all radii

(8.2)    $|f'(z)| = O((1-|z|)^{-1/2})$,


where the symbol O does not necessarily indicate uniformity for the different radii.
The relation (8.2) holds also in any angle of the type described in Theorem 6
which corresponds to a radius for which (8.2) holds.


If we set w=f(z), then the function maps $|z| < 1$ on a simply connected
region R of the w-plane. Now, this region R possesses at least two distinct
boundary points w=a and w=b, ($a \ne b$). Indeed, if R were the entire plane
then the inverse function z=g(w) of w=f(z) would map the plane on the
interior of $|z| < 1$. It would, therefore, be bounded in the whole plane and by
Liouville's theorem be identically a constant, which is contrary to our
assumption. If R were the whole plane with the exception of one point, w=a,
then g(w) would be regular and bounded in the whole plane with the exception
of the one point, w=a. This point, by Riemann's theorem, would be a
removable singularity, and again z=g(w) would be identically constant. Now, by a
familiar argument the function


i= = X(w), 
((w — a)/(w — 


where the constant c is suitably chosen, maps the region R conformally on a
bounded region of the $\zeta$-plane.
The function


$h(z) = \lambda(f(z))$
is regular, univalent and bounded in $|z| < 1$. Let us suppose that Lemma 3




has already been proved for h(z). Then, it will also hold for f(z). Indeed,

$f'(z) = \dfrac{h'(z)}{\lambda'(f(z))}$.

Since we have assumed that $\limsup_{z\to e^{i\alpha}} |h'(z)|\,(1-|z|)^{1/2} < \infty$ for almost all
points $z = e^{i\alpha}$ on $|z| = 1$, where z lies in corresponding angles as described
in Theorem 6, the asserted lemma will follow for $f'(z)$ provided that
$\liminf_{z\to e^{i\alpha}} |\lambda'(f(z))| > 0$ for almost all $e^{i\alpha}$ in the corresponding angles. But
now


= 


—b [A(w) ]? 


(w — — 


which shows that $\liminf_{z\to e^{i\alpha}} |\lambda'(f(z))| = 0$ only if there exists a sequence of
points $z_n \to e^{i\alpha}$ for which $f(z_n) \to b$ or $f(z_n) \to a$. This, however, can only
happen for a set of $e^{i\alpha}$ of measure zero(22).

It suffices, therefore, to prove Lemma 3 for a bounded univalent
function f(z). Now, w=f(z) maps the circle $|z| < 1$ on a bounded region of
the w-plane. Denote the area of this region by A. We have, setting $z = re^{i\theta}$,


(8.3)    $\int_0^{2\pi}\!\!\int_0^{\rho} r\,|f'(re^{i\theta})|^2\,dr\,d\theta < A$

for every $0 \le \rho < 1$. The function

(8.4)    $\Phi(z) = \int_0^z z\,[f'(z)]^2\,dz$


is regular in $|z| < 1$ and we shall perform the integration along the radius
joining z=0 and $z = \rho e^{i\theta}$ so that


$\Phi(\rho e^{i\theta}) = \int_0^{\rho} r\,[f'(re^{i\theta})]^2\,e^{2i\theta}\,dr$.


Hence,


$|\Phi(\rho e^{i\theta})| \le \int_0^{\rho} r\,|f'(re^{i\theta})|^2\,dr$.


Integration of the last inequality with respect to $\theta$ together with (8.3) yields


(8.5)    $\int_0^{2\pi} |\Phi(\rho e^{i\theta})|\,d\theta < A$


for every $0 \le \rho < 1$.
Now it is a familiar fact that if a function $\Phi(z)$ is regular in $|z| < 1$ and

(22) F. and M. Riesz, Compte Rendu du Quatrième Congrès des Mathématiciens
Scandinaves, 1920, pp. 28-30.




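satisfies the condition (8.5) it may be represented in the following form(23):


(8.6)    $\Phi(z) = \dfrac{1}{2\pi}\int_0^{2\pi} \dfrac{e^{it}+z}{e^{it}-z}\,d\mu(t) + i\beta$,


where the integral is a Stieltjes integral, $\mu(t)$ is a function of bounded
variation in the interval $0 \le t < 2\pi$ and $\beta$ is a constant. Equations (8.4) and (8.6)
permit us to express $[f'(z)]^2$ in the form


(8.7)    $[f'(z)]^2 = \dfrac{1}{\pi z}\int_0^{2\pi} \dfrac{e^{it}}{(e^{it}-z)^2}\,d\mu(t)$.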


Hence, 
an 


where M(t) denotes the total variation of the function $\mu(t)$ in the interval
(0, t). The right-hand side of this inequality approaches a definite finite limit
as $z = re^{i\theta} \to e^{i\alpha}$ in an angle of the type described in Theorem 6 for almost
all $e^{i\alpha}$(24). Hence, the right-hand side remains bounded in such angles. Thus,


(8.8)    $|f'(z)| \le C_\alpha\,(1-|z|)^{-1/2}$


in the angular neighborhood of almost all points $e^{i\alpha}$, where $C_\alpha$ is a constant
independent of z, but in general depending on $\alpha$. This proves the lemma.


COROLLARY 8. Let w=f(z) be regular and univalent in $|z| < 1$. Then, for
almost all points $e^{i\alpha}$ on $|z| = 1$ every line segment joining an interior point of
$|z| < 1$ with $e^{i\alpha}$ is mapped on a rectifiable arc by the function w=f(z).


This follows readily by integrating (8.8) along such a line segment(25).
If we restrict ourselves to radial approach in Corollary 8, it is possible to
state a sharper result which will be used in the proof of Theorem 6:


LEMMA 4. Let w=f(z) be regular and univalent in $|z| < 1$. If

(8.9)    $l_{\rho,\theta} = \int_{\rho}^{1} |f'(re^{i\theta})|\,dr$,    $z = re^{i\theta}$,


then for almost all values of $\theta$ in $0 \le \theta \le 2\pi$ and for all values of $\rho$ in $0 \le \rho < 1$,
$l_{\rho,\theta}$ is finite and


(8.10)    $\lim_{\rho\to 1} l_{\rho,\theta}\,(1-\rho)^{-1/2} = 0$.


(23) See, for example, R. Nevanlinna, Eindeutige analytische Funktionen, Berlin, 1936, p.
185.

(24) After integration by parts the integral in (8.7) becomes one of the type considered in
Carathéodory's proof of Fatou's theorem. Cf. L. Bieberbach, Lehrbuch der Funktionentheorie,
vol. 2, 2d edition, (1931), pp. 148-151.

(25) The corollary includes a result stated by M. Lavrentieff, Physico-Mathematical
Institute of Stekloff, vol. 5 (1934), p. 207.




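The formula in (8.9) represents the length of the image of the radial
segment joining the points $\rho e^{i\theta}$ and $e^{i\theta}$.

One may assume without loss of generality, for the same reasons as in the
proof of Lemma 3, that f(z) is bounded in $|z| < 1$. Then, inequality (8.3) holds
for some A. The total area A of the image of $|z| < 1$ is given by


$\int_0^{2\pi}\!\!\int_0^{1} r\,|f'(re^{i\theta})|^2\,dr\,d\theta$.

Hence, by Fubini's theorem, for almost all $\theta$ in $0 \le \theta \le 2\pi$

$\int_0^{1} r\,|f'(re^{i\theta})|^2\,dr$

has a finite value. Hence, for almost all $\theta$

$\lim_{\rho\to 1} \int_{\rho}^{1} r\,|f'(re^{i\theta})|^2\,dr = 0$.

Thus to any $\varepsilon > 0$ one may assign a number $\delta = \delta(\varepsilon, \theta)$ so that $1 - \rho < \delta$ implies

$\int_{\rho}^{1} r\,|f'(re^{i\theta})|^2\,dr < \varepsilon$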
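for almost all $\theta$. Hence, by Schwarz's inequality

$l_{\rho,\theta}^{\,2} \le \int_{\rho}^{1} \dfrac{dr}{r} \int_{\rho}^{1} r\,|f'(re^{i\theta})|^2\,dr < \varepsilon \log\dfrac{1}{\rho}$

for almost all $\theta$ and $1 > \rho > 1 - \delta(\varepsilon, \theta)$. Since $\log(1/\rho)/(1-\rho) \to 1$ as $\rho \to 1$, this proves (8.10).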
Using this lemma, one can now prove


LEMMA 5. Let w=f(z) be regular and univalent in $|z| < 1$. Then on almost all
radii

$|f'(z)| = o((1-|z|)^{-1/2})$,

where the symbol o is not intended to indicate uniformity for the different radii.


We know that on almost all radii (8.2) and (8.10) hold and $\lim_{r\to 1} f(re^{i\theta}) = w$
exists and is finite(26). Choose any one of these radii $\theta = \theta_0$ and on it an
arbitrary point $z_0$. Let $f(z_0) = w_0$. The segment of the radius between the points $z_0$


(26) For the proof of the last statement one need merely apply the fact that the integral in
(8.9) remains finite for almost all $\theta$. Indeed, take any such $\theta_0$. Then,

$|f(r_2 e^{i\theta_0}) - f(r_1 e^{i\theta_0})| \le \int_{r_1}^{r_2} |f'(re^{i\theta_0})|\,dr$,

and the last integral may be made smaller than any preassigned $\varepsilon > 0$ provided that $r_1$ and $r_2$
are both chosen sufficiently near unity.




and $e^{i\theta_0}$ is carried into a rectifiable arc joining the points $w_0$ and w. Its length
$l_{z_0}$ is given by

(8.11)    $l_{z_0} = \varepsilon_{z_0}\,(1-|z_0|)^{1/2}$,

where by (8.10)

(8.12)    $\lim_{z_0\to e^{i\theta_0}} \varepsilon_{z_0} = 0$,


the approach being taken radially. Now draw a circle $K_{z_0}$ about the point $z_0$
as center with radius equal to $1 - |z_0|$. The interior of the circle $K_{z_0}$ is carried
by w=f(z) into a region $R_{z_0}$ of the w-plane.

According to Koebe's "Verzerrungssatz" the region $R_{z_0}$ contains the circle
$|w - w_0| < ((1-|z_0|)/4)\,|f'(z_0)|$.


Now if we set

$|f'(z_0)| = C_{z_0}\,(1-|z_0|)^{-1/2}$,

according to (8.2) $C_{z_0}$ is bounded along the radius $\theta = \theta_0$. Thus $R_{z_0}$ contains
the circle $|w - w_0| < (1/4)\,C_{z_0}\,(1-|z_0|)^{1/2}$. In view of (8.11) this may also be
written $|w - w_0| < C_{z_0} l_{z_0}/4\varepsilon_{z_0}$. Denoting by $\rho_{z_0}$ the radius of this circle, we have
on the one hand

$\rho_{z_0} = \dfrac{C_{z_0}\,l_{z_0}}{4\varepsilon_{z_0}}$

and on the other $\rho_{z_0} \le l_{z_0}$. Hence,

$C_{z_0} \le 4\varepsilon_{z_0}$.


Together with (8.12) this implies that

$\lim_{z_0\to e^{i\theta_0}} C_{z_0} = 0$


with radial approach. This proves the lemma.
We are now ready for the proof of Theorem 6. Let $\theta = \theta_0$ be a radius for
which (8.2) holds in any angle as asserted in Lemma 3 and also


(8.13)    $\lim_{r\to 1} |f'(re^{i\theta_0})|\,(1-r)^{1/2} = 0$.


By Lemmas 3 and 5 the set of such $\theta_0$ is of measure $2\pi$.
Consider the function


$g(z) = f'(z)\,(e^{i\theta_0} - z)^{1/2}$,


where we choose that branch of the square root which is positive for real
positive values of the radicand. This function is regular and single-valued in
$|z| < 1$. Now, take a fixed angle of opening less than $\pi$ with vertex in $e^{i\theta_0}$.
In this angle

$\dfrac{1}{M} < \dfrac{|e^{i\theta_0} - z|}{1-|z|} < M$




for a suitable positive constant M. Hence, by (8.13)


$\lim_{r\to 1} g(re^{i\theta_0}) = 0$,


while by (8.2) the function g(z) is bounded in the fixed angle. By Lindelöf's
theorem(27) $\lim_{z\to e^{i\theta_0}} g(z) = 0$ uniformly in every angle contained in the fixed
angle. This proves the theorem.

9. Example on the slowness of approach of $|f^{(k)}(z)|\,(1-|z|)^k$. We have
shown in Corollary 2, §5, that if the function f(z) is bounded and univalent
for $|z| < 1$, and also under various alternative conditions, then we have


(9.1)    $\lim_{|z_n|\to 1} f'(z_n)\,(1-|z_n|) = 0$,    $|z_n| < 1$.


Even for the class of bounded univalent functions, continuous in $|z| \le 1$,
equation (9.1) cannot be improved by establishing results on rate of approach
in equation (9.1) or by replacing the second factor by that factor raised to a
suitable power. Indeed we shall prove that the limit in (9.1) can be
approached arbitrarily slowly, in the sense of


THEOREM 7. Let the function Q(r) be defined and positive for $0 < r < 1$, with
$\lim_{r\to 1} Q(r) = 0$. Then there exists a function F(z) analytic and univalent interior
to $\gamma$: $|z| = 1$, continuous for $|z| \le 1$, and there exists a sequence of points
$z_1, z_2, \cdots$ interior to $\gamma$ with $|z_n| \to 1$ such that we have


(9.2)    $\lim_{n\to\infty} \dfrac{F'(z_n)\,(1-|z_n|)}{Q(|z_n|)} = \infty$.


In fact, we shall choose F(z) real for real z, and $z_n$ real.


As a matter of convenience, we establish first Theorem 7 and then an
extension of Theorem 7 to higher derivatives. The ensuing proof is given in
preparation for the more general theorem, and is somewhat more complicated
than is necessary for the proof of Theorem 7 alone.
We shall find useful a function analytic and univalent for $|z| < 1$ whose
Taylor expansion about the origin has all of its coefficients positive. Such a
function is


$w_1 = f_1(z) = \dfrac{z}{(1-z)^2}$,


which maps the region $|z| < 1$ smoothly onto the $w_1$-plane slit along the axis
of reals from $-1/4$ to $-\infty$. The function

(27) E. Lindelöf, Acta Societatis Scientiarum Fennicae, vol. 46 (1915).




p 
then maps $|z| < 1$ smoothly onto a Jordan region(28) symmetric in the axis of
reals. For definiteness we choose $\rho = 1/2$, and denote by $J_0$ the Jordan region
of the w-plane which is the image of $|z| < 1$ under the map(29)


2 3 4 


Construct in the w-plane new Jordan regions $J_1, J_2, \cdots$ with the same
shape and orientation as $J_0$, mutually exterior and exterior to $J_0$, with the
analogue $B_k$ for $J_k$ of the point w=0 for $J_0$ lying on the axis of reals, so that
the sequence $B_0 = 0, B_1, B_2, \cdots$ forms a monotonically increasing sequence.
Choose moreover the region $J_k$ just $(1/2^k)$th the size of $J_0$ in linear
dimensions, and locate (as is possible) the sequence of regions $J_k$ in such a way that
their totality lies in some circle $|w| < D$.

The region $J_0$ is symmetric in the axis of reals, so its boundary (an analytic
Jordan curve) cuts that axis in precisely two points $A_0$ (to the left of the
origin) and $C_0$ (to the right of the origin). Denote the analogous points for $J_k$
by $A_k$ and $C_k$. The boundary of $J_k$ has a vertical tangent at both $A_k$ and $C_k$.

A Jordan region R is to be constructed in the w-plane from the regions
$J_0, J_1, J_2, \cdots$ by connecting each region to the preceding region by a canal;
each of the two banks of such a canal shall be a segment of one of the lines
$y = \pm d_k$, $d_k > 0$. Each point interior to $J_k$ shall lie interior to R. The first
canal, whose boundaries are segments of $y = \pm d_1$, joins $J_0$ in the
neighborhood of $C_0$ with $J_1$ in the neighborhood of $A_1$; the second canal, whose
boundaries are segments of $y = \pm d_2$, joins $J_1$ in the neighborhood of $C_1$ with $J_2$ in
the neighborhood of $A_2$, and so on. The choice of the numbers $d_k$ is now to
be made more precise.

Denote by w=F(z) the function which maps $|z| < 1$ onto R with $F(0) = 0$,
$F'(0) > 0$; of course F(z) depends on the numbers $d_1, d_2, \cdots$. Choose $d_1$
independently of $d_2, d_3, \cdots$ so small that the subset $R_1$ composed of all points
of R not in $J_0$ corresponds under the transformation w=F(z) to a set of
points z interior to $\gamma$: $|z| = 1$ at which we have


(9.4)    $Q(|z|) < 1/3$.


(28) A Jordan region is any region bounded by a Jordan curve.

(29) It is sufficient for the purpose of both Theorem 7 and Theorem 8 to choose here a function F_0(z) which maps |z|<1 smoothly onto a Jordan region with F_0(0)=0, F_0'(0)=1, and has all of the coefficients of its Taylor expansion about the origin positive. For instance we may also choose

F_0(z) = 2z/(2 − z) = z + (1/2) z^2 + (1/2^2) z^3 + ···,

which maps |z|<1 onto the interior of the circle |w−2/3| = 4/3.




148 W. SEIDEL AND J. L. WALSH {July 


Such choice of d_1 is possible. For under the map w=F(z) it follows from a theorem due to Lindelöf(30) that the subset R_1 is mapped into a set bounded in part by an arc of γ and whose remaining boundary (a Jordan arc) can be made as near to γ as desired. For the boundary points of R_1 not boundary points of R are the points of the boundary of J_0 in the neighborhood of the point C_0 between the lines y=±d_1; by choosing d_1 sufficiently small all such points can be made uniformly as near as desired to the boundary of R; so by Lindelöf's theorem all points of the boundary of the transform of R_1 (and hence all points of the transform of R_1 itself) can be made as near to γ as desired, and (9.4) is justified.

Similarly the number d_2 is to be chosen so small that all points of R not in J_0 or J_1 or in the canal joining J_0 and J_1 correspond under the map w=F(z) to points interior to γ at which we have Q(|z|)<1/9; more generally the number d_k is to be chosen so that all points of R not in J_0, J_1, ···, J_{k−1} or in the canals joining successive regions J_0, J_1, ···, J_{k−1}, correspond under the map w=F(z) to points interior to γ at which we have

(9.5) Q(|z|) < 1/3^k;


such successive choice of the numbers d_k is possible, again by Lindelöf's theorem. There are no further restrictions on the numbers d_k so far as the requirements of Theorem 7 itself are concerned. We now introduce the inner radius ρ(w_0) of the region R with respect to the arbitrary point w_0 of R(31). It is well known that ρ(w_0) has a monotonic character with respect to R: if R is increased so also is ρ(w_0); if R is stretched uniformly in the linear ratio 1:m with w_0 fixed, then ρ(w_0) is multiplied by m; if R is the interior of a circle with center at w_0, the inner radius is the usual radius of this circle.

The inner radius of R with respect to the point B_k is greater than 1/2^k, for it follows from (9.3) that the inner radius of J_0 with respect to B_0 is unity, so the inner radius of J_k with respect to B_k is 1/2^k. On the other hand, if z_k denotes the point of |z|<1 which corresponds to the point B_k under the transformation w=F(z), the inner radius of R with respect to B_k is |F'(z_k)|(1 − |z_k|^2), so we may write |F'(z_k)|(1 − |z_k|^2) > 1/2^k. From inequality (9.5) we have Q(|z_k|)<1/3^k, whence

(9.6) |F'(z_k)|(1 − |z_k|^2) / Q(|z_k|) > 3^k/2^k,

from which (9.2) follows(32).


(30) Acta Societatis Scientiarum Fennicae, vol. 46 (1915). Or see Walsh, Interpolation and Approximation, §2.1. In applying Lindelöf's result it is essential to notice that the region R is bounded independently of the numbers d_k.

(31) Cf. §4, Footnote 10.

(32) In the proof of Theorem 7 we might equally well have used an example due to Szegö, Mathematische Zeitschrift, vol. 23 (1925), pp. 45-61; pp. 57-59. Szegö does not mention the



Under the present circumstances the region R is symmetric in the axis of reals, the numbers z_k are real, and F'(z_k) is positive, so the absolute value signs may be removed from (9.6). Of course F(z) is continuous in |z|≤1 (when suitably defined on |z|=1), as the mapping function for a Jordan region. The points B_k are real and positive and approach the boundary of R, so the points z_k are real and positive and approach the point z=1.

Theorem 7 shows that the limit in (9.1) can be approached arbitrarily slowly; by virtue of §4, Theorem 3, we may also say that lim_{|z_n|→1} D_1[f(z_n)], considered as a function of 1−|z_n|, can also be approached arbitrarily slowly.

We now consider the generalization of Theorem 7 to higher derivatives: 


THEOREM 8. Let the function Q(r) be defined and positive for 0<r<1, with lim_{r→1} Q(r)=0. Let the positive integer m be given. Then there exists a function F(z) analytic and univalent interior to γ: |z|=1, continuous for |z|≤1, and a sequence of points z_1, z_2, ··· interior to γ with |z_n| = r_n→1, such that we have

(9.7) lim_{n→∞} F^{(m)}(z_n)(1 − |z_n|)^m / Q(|z_n|) = ∞.

Indeed, we shall choose F(z) real for real z, and z_n real.


In the proof of Theorem 8 we use precisely the region R introduced in the proof of Theorem 7, with further restrictions on the numbers d_k; the function F(z) is, as before, the mapping function.

It follows from equation (9.3) that the function

(9.8) w = F_k(z) = b_k + z/2^k + 2z^2/2^{k+1} + 3z^3/2^{k+2} + ···

maps |z|<1 onto the region J_k in such a way that the point z=0 corresponds to the point B_k: w=b_k, with the axis of reals in one plane corresponding to the axis of reals in the other plane. The function

(9.9) w(ζ) = F((ζ + z_k)/(1 + z_k ζ)) = b_k + a_1^{(k)} ζ + a_2^{(k)} ζ^2 + a_3^{(k)} ζ^3 + ···,

where F(z_k)=b_k, maps |ζ|<1 onto R so that ζ=0 corresponds to the point B_k with the axis of reals in the one plane corresponding to the axis of reals in the other. When d_k and d_{k+1} approach zero, the kernel in the sense of Carathéodory(33) of the variable region R, considered with B_k as central point (that is, Aufpunkt) is precisely the region J_k. It follows from the results of Carathéodory (loc. cit.) that the corresponding mapping function w(ζ) defined by (9.9) approaches the function F_k(ζ) defined by (9.8), throughout the interior of |ζ|<1, uniformly on any closed point set interior to |ζ|<1. Indeed, such uniform approach of w(ζ) defined by (9.9) to F_k(ζ) is a consequence of the approach to zero of d_k and d_{k+1}, independently of the behavior of d_1, d_2, ···, d_{k−1}, d_{k+2}, ···. Otherwise there would exist a sequence of sequences of numbers d_1, d_2, ··· with d_k and d_{k+1} approaching zero and the corresponding function w(ζ) in (9.9) not approaching F_k(ζ) as defined by (9.8); this is impossible. Thus the coefficient a_j^{(k)}, considered as a function of d_k and d_{k+1} alone, approaches the corresponding coefficient j/2^{j+k−1}.

property (9.2), nor does Sewell, but the latter (these Transactions, vol. 41 (1937), pp. 84-123) mentions for Szegö's region the relation (notation of §1) lim_{n→∞} D_1(w_n)/Q(|z_n|) = ∞, w_n=F(z_n), which by virtue of the inequality |F'(z_n)|(1 − |z_n|^2) ≥ D_1(w_n) implies (9.2). Szegö's example does not seem to apply at once to higher derivatives.

The method of proof of Theorem 7 has also been employed by Walsh, Bulletin of the American Mathematical Society, vol. 46 (1940), pp. 101-108, for a somewhat different purpose.

The inner radius ρ(b_k) of R with respect to the point B_k is greater than 1/2^k, so in (9.9) we have

(9.10) a_1^{(k)} > 1/2^k.


We have already made restrictions on the numbers d_k in connection with Theorem 7. We now impose the further restriction that d_1, d_2, ··· are to be chosen in pairs (d_1, d_2), (d_2, d_3), (d_3, d_4), ··· successively so small that we always have the inequalities (9.11) (k=1, 2, 3, ···); this choice of the d_k is possible. We have no other restrictions to be placed on the numbers d_k.

By Lemma 1 of §2 we now have
(1 — | » wim—(0) 
(m — v)! 


where w({) is defined by (9.9). Inequalities (9.11) and (9.10) now yield 
(2, = 2, >0) 


m—1 
= 


2.m™m 

(1 — | | m=1 (k) Zk 
(s:)>% > k>41, 

m! 


so, as in (9.6), we write from (9.5) 
(1 — | (24) 
When k becomes infinite, the point z_k approaches the point z=1, so equation (9.7) and Theorem 8 follow.


As will be seen, this function F(z) is significant as a “Gegenbeispiel” also 
in some of our later theorems. 


(33) Cf. Footnote 13.




CHAPTER II. BOUNDED FUNCTIONS: CONFIGURATIONS C_p AND D_p


The problem which will occupy us in this chapter and the next is to what 
extent the results of the first chapter can be extended to the class of bounded 
functions. 

It should be remarked at the start that in §5, Corollary 2, it is not possible 
to drop the condition of univalence. Indeed we have 


THEOREM 1. There exists a function f(z) regular and bounded in |z|<1 and a sequence of points {z_n} (|z_n|<1), |z_n|→1, for which

lim inf_{n→∞} |f'(z_n)|(1 − |z_n|) > 0.


That |f'(z_n)|(1−|z_n|) is always bounded when f(z) is regular and bounded in |z|<1, follows from an easy application of Schwarz's lemma(34). To prove Theorem 1 we consider the function(35)

f(z) = exp[(z+1)/(z−1)].

It is clear that since ℜ[(z+1)/(z−1)]<0 in |z|<1, we have |f(z)|<1 in |z|<1. Now, with z=re^{iθ},

|f'(z)|(1 − r^2) = 2 (1 − r^2)/(1 − 2r cos θ + r^2) · exp[(−1 + r^2)/(1 − 2r cos θ + r^2)].

Along the curve r=cos θ, which passes through the point z=1 and is tangent there to the unit circle, we have 1 − 2r cos θ + r^2 = 1 − r^2, whence

|f'(z)|(1 − r^2) = 2/e,

so that as θ→0, the corresponding limit is 2/e>0.
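The value 2/e along the curve r=cos θ can be confirmed numerically. The following is a minimal sketch, assuming only the function f(z) = exp[(z+1)/(z−1)] of the proof and its derivative f'(z) = −2 f(z)/(z−1)^2; the function names are chosen here for illustration.

```python
import cmath
import math

def f_prime(z):
    # f(z) = exp((z+1)/(z-1)); the chain rule gives f'(z) = -2/(z-1)^2 * f(z)
    return -2.0 / (z - 1) ** 2 * cmath.exp((z + 1) / (z - 1))

def scaled_derivative(theta):
    """|f'(z)| (1 - r^2) at the point z = r e^{i theta} of the curve r = cos(theta)."""
    r = math.cos(theta)
    z = cmath.rect(r, theta)
    return abs(f_prime(z)) * (1 - r * r)

values = [scaled_derivative(t) for t in (0.5, 0.1, 0.01)]
print(values, 2 / math.e)
```

The quantity stays equal to 2/e (up to rounding) however close θ is to 0, so it does not tend to zero as z approaches the point 1 along this curve.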

10. A lower bound on D_1(w). In order to obtain the conclusion lim_{n→∞} |f'(z_n)|(1−|z_n|)=0, it is necessary to limit oneself to particular sequences {z_n} in the circle |z|<1. By Theorem 1 our result is as follows:

THEOREM 2. Let f(z) be regular and bounded in |z|<1:

|f(z)| ≤ M;

let {z_n} be any sequence of points in |z|<1, and let w_n=f(z_n). Then, a necessary and sufficient condition for

lim_{n→∞} |f'(z_n)|(1 − |z_n|) = 0

is that lim_{n→∞} D_1(w_n)=0.


(34) Cf. L. Bieberbach, Lehrbuch der Funktionentheorie, vol. 2, 2d edition, 1931, p. 112.
(35) For this particularly simple example the authors are indebted to Professor G. Szegö.




This condition will follow directly from the more precise

THEOREM 3. Let f(z) be regular and bounded in |z|<1:

|f(z)| ≤ M,

let z_0 be any point in |z|<1, and let w_0=f(z_0). Then, the following inequality

(10.1) D_1(w_0) ≤ |f'(z_0)|(1 − |z_0|^2) ≤ [8M D_1(w_0)]^{1/2}

is always satisfied.


The first inequality in (10.1) is simply a particular case of §4, Theorem 2. It, therefore, remains to prove the second inequality alone.

It was proved by Landau and Dieudonné(36) that if

w = g(z) = z + ···

is a regular function in |z|<1 satisfying the inequality

|g(z)| ≤ M for |z|<1,

then g(z) is univalent in the circle |z|<1/2M and covers simply the circle

|w| ≤ 1/4M.
Consider now the function

φ(z) = [f((z + z_0)/(1 + z̄_0 z)) − f(z_0)] / [f'(z_0)(1 − |z_0|^2)] = z + ···.

In |z|<1 the function φ(z) is regular and satisfies the inequality

|φ(z)| ≤ 2M / [|f'(z_0)|(1 − |z_0|^2)].

Hence, in accordance with the theorem of Landau and Dieudonné, w=φ(z) covers simply the circle

|w| ≤ |f'(z_0)|(1 − |z_0|^2) / (8M).

The function w=f(z), therefore, covers simply the circle

|w − w_0| ≤ |f'(z_0)|^2 (1 − |z_0|^2)^2 / (8M),   w_0 = f(z_0).

From this it follows that


(36) E. Landau, Sitzungsberichte der Preussischen Akademie der Wissenschaften, Berlin, Physikalisch-Mathematische Klasse, (1926), pp. 467-474; J. Dieudonné, Annales de l'École Normale Supérieure, (3), vol. 48 (1931), pp. 247-358.




D_1(w_0) ≥ |f'(z_0)|^2 (1 − |z_0|^2)^2 / (8M),

which is merely another form of the second inequality (10.1)(37).

While the constant 8 in (10.1) is not the best possible, the order [D_1(w_0)]^{1/2} as D_1(w_0)→0 cannot be improved, as may be seen from a study in |z|<1 of the function

f(z) = Mz(1 − Mz)/(M − z),   M>1,

previously considered by J. Dieudonné(38) in the neighborhood of the point z = M−[M^2−1]^{1/2}. Indeed, let z be any point of the unit circle, lying on the real axis, such that 0<z<M−[M^2−1]^{1/2}. It is seen by direct computation that

(10.2) |f'(z)|(1 − z^2) = M^2 (1 − 2Mz + z^2)(1 − z^2)/(M − z)^2
        = [M^2 (1 − z^2)/(M − z)^2] [z − (M − (M^2 − 1)^{1/2})] [z − (M + (M^2 − 1)^{1/2})].

We set w_0 = f(M−(M^2−1)^{1/2}) = M[M−(M^2−1)^{1/2}]^2. Hence, since D_1(w) = w_0 − w,

(10.3) D_1(w) = [M^2/(M − z)] [z − (M − (M^2 − 1)^{1/2})]^2.

Comparison of the equations (10.2) and (10.3) shows that as w→w_0, |f'(z)|(1−z^2) = O((D_1(w))^{1/2}) but |f'(z)|(1−z^2) ≠ o((D_1(w))^{1/2}).
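The comparison of (10.2) and (10.3) can be illustrated numerically. The sketch below assumes the function is f(z) = Mz(1 − Mz)/(M − z), takes M=2 as a sample value, and uses the relation D_1(w) = w_0 − w for real z below the critical point, as stated in the text; all names are introduced here for illustration.

```python
import math

M = 2.0                       # sample bound, any M > 1 would do
S = math.sqrt(M * M - 1)
Z0 = M - S                    # zero of f' in (0, 1)

def f(z):
    return M * z * (1 - M * z) / (M - z)

def f_prime(z):
    # closed form obtained by differentiating f directly
    return M * M * (1 - 2 * M * z + z * z) / (M - z) ** 2

def d1(z):
    # D_1(w) = w_0 - w for w = f(z), 0 < z < Z0 (relation taken from the text)
    return f(Z0) - f(z)

def ratio(z):
    # |f'(z)| (1 - z^2) divided by D_1(w)^(1/2)
    return abs(f_prime(z)) * (1 - z * z) / math.sqrt(d1(z))

limit = 2 * M * (1 - Z0 * Z0) / math.sqrt(S)   # value of the ratio as z -> Z0
for h in (1e-1, 1e-2, 1e-3):
    print(h, ratio(Z0 - h), limit)
```

The ratio tends to a finite positive limit as z approaches the critical point, which is the numerical content of the statement that the quantity is O but not o of (D_1(w))^{1/2}.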

11. Irregular sequences. The question now arises whether one may generalize Theorem 3 to higher derivatives in the same manner as Theorems 4 and 5 generalize Theorem 3 in Chapter I. In the present case, however, the situation is more complicated than in the case of univalent functions, as examples (§12) will show. Before giving the examples it will be desirable to give some definitions and prove two theorems. Being given two points z_1 and z_2 of the unit circle |z|<1, we define as in §7 the non-euclidean distance ρ(z_1, z_2) between them(39).

DEFINITION 1. A sequence of points {z_n}, (|z_n|<1), |z_n|→1, will be called a regular sequence for a function f(z) analytic in |z|<1 if there exists a number


(37) It will be observed from the above that it might be of advantage sometimes to replace the right-hand side of (10.1) by [4M'D_1(w_0)]^{1/2} where M' is the least upper bound of |f(z)−f(z_0)| for |z|<1 and z_0 fixed.

(38) J. Dieudonné, ibid.

(39) For the notions of non-euclidean geometry particularly in their relation to the theory of functions, cf. G. Julia, Principes Géométriques d'Analyse, Première Partie, 1930, especially Chapters II and IV.




λ>0 such that for any sequence of points {z_n'} whose non-euclidean distance ρ(z_n, z_n') is less than λ for all n we have

lim_{n→∞} [f(z_n) − f(z_n')] = 0.

A sequence of points {z_n} which is not regular will be called irregular.

DEFINITION 2. A sequence of points {z_n} (|z_n|<1), |z_n|→1, will be called a quasi-regular sequence of order m for a function f(z) analytic in |z|<1 if

lim_{n→∞} |f^{(k)}(z_n)|(1 − |z_n|)^k = 0, for k=1, 2, ···, m,

while

lim sup_{n→∞} |f^{(m+1)}(z_n)|(1 − |z_n|)^{m+1} > 0.

The case m=∞ is allowed and means that lim_{n→∞} |f^{(k)}(z_n)|(1−|z_n|)^k=0 for k=1, 2, ···.

Denote by Γ_n^λ the non-euclidean circle of non-euclidean radius λ and non-euclidean center z_n. We prove now the following

THEOREM 4. An irregular sequence {z_n} for a function f(z) regular and bounded in |z|<1 is quasi-regular of order m if to every sufficiently small positive λ there corresponds an integer N(λ)>0 such that for all n>N(λ) the function f(z) assumes the value f(z_n) exactly m+1 times in the circle Γ_n^λ (counting multiplicities).


Consider the function

(11.1) g_n(ζ) = f((ζ + z_n)/(1 + z̄_n ζ)) − f(z_n).

By hypothesis, for n>N(λ) the function g_n(ζ), which is regular and bounded in |ζ|<1, assumes the value 0 exactly m+1 times in the circle |ζ|<(e^λ−1)/(e^λ+1). Now, the sequence {z_n} is assumed to be irregular. In accordance with Definition 1 this means that for any λ>0 we can find a subsequence of the {z_n}, which we shall denote by {z_{n_k}}, and a sequence {z'_{n_k}} such that ρ(z_{n_k}, z'_{n_k})<λ and for some δ>0 we have |f(z'_{n_k})−f(z_{n_k})| ≥ δ. This implies, however, that the sequence (11.1) cannot tend uniformly to zero in every closed subregion of |ζ|<1. Indeed, suppose that lim_{n→∞} g_n(ζ)=0 uniformly in every closed subregion of |ζ|<1. To any preassigned ε>0 there would correspond a positive integer n(ε) so that for n>n(ε) we would have |g_n(ζ)|<ε in |ζ|<(e^λ−1)/(e^λ+1). Setting ζ_{n_k} = (z'_{n_k} − z_{n_k})/(1 − z̄_{n_k} z'_{n_k}), we would infer that |g_{n_k}(ζ_{n_k})|<ε for n_k>n(ε). Replacing this inequality in (11.1), we find |f(z'_{n_k})−f(z_{n_k})|<ε for n_k>n(ε). If we choose ε<δ, we arrive at a contradiction.




Hence, there exists(40) a subsequence of the sequence {g_n(ζ)}, which we shall denote by {g_{n_k}(ζ)}, which converges uniformly in every closed subregion of |ζ|<1 to a function G(ζ) which is not identically zero, and (since G(0)=0) is not identically a constant. The function G(ζ) is regular in |ζ|<1.

Since G(ζ) is not identically zero, there must exist a λ_1, 0<λ_1<λ, so that G(ζ)≠0 on the circle |ζ| = (e^{λ_1}−1)/(e^{λ_1}+1). Since, furthermore, the sequence g_{n_k}(ζ) converges uniformly to G(ζ) on that circle, for sufficiently large values of n_k we have g_{n_k}(ζ)≠0 on |ζ| = (e^{λ_1}−1)/(e^{λ_1}+1). Now, by hypothesis g_{n_k}(ζ) vanishes precisely m+1 times in the circle |ζ|<(e^{λ_1}−1)/(e^{λ_1}+1) provided n_k>N(λ_1). Hence, by Hurwitz's theorem G(ζ) vanishes precisely m+1 times in the circle |ζ|<(e^{λ_1}−1)/(e^{λ_1}+1). But since λ may be taken arbitrarily small, G(ζ) must have a zero of order m+1 at the origin. Hence, G'(0)=0, G''(0)=0, ···, G^{(m)}(0)=0, G^{(m+1)}(0)≠0. In view of (11.1) and (2.2), we see from the relations g'_{n_k}(0)→0, g''_{n_k}(0)→0, ···, g^{(m)}_{n_k}(0)→0, g^{(m+1)}_{n_k}(0)→G^{(m+1)}(0)≠0 that

lim sup_{n→∞} |f^{(m+1)}(z_n)|(1 − |z_n|)^{m+1} > 0.

On the other hand, suppose that for some integer k, 0<k<m+1, and for some subsequence {z_{n'}} of the {z_n}

(11.2) |f^{(k)}(z_{n'})|(1 − |z_{n'}|)^k > η > 0

for a suitable positive η, independent of n'. Consider the corresponding subsequence {g_{n'}(ζ)} of the sequence (11.1). By selecting a further subsequence, if necessary, we may assume that the sequence {g_{n'}(ζ)} is a uniformly convergent one in every closed subregion of |ζ|<1. Two cases are possible according as {g_{n'}(ζ)} converges to zero or to some function not identically a constant. In the first case, the derivatives of all orders of g_{n'}(ζ) also converge to zero and application of formula (2.2) for the case n=k shows that |f^{(k)}(z_{n'})|(1−|z_{n'}|)^k→0, which is a contradiction of (11.2). In the second case, the nonconstant limit function G(ζ) of the sequence g_{n'}(ζ) by the argument already given must have a zero of order m+1 at the origin, so that all its derivatives up to the (m+1)st must vanish at the origin. Application of formula (2.2) again contradicts (11.2). Thus, in both cases (11.2) yields a contradiction. Hence,

lim_{n→∞} |f^{(k)}(z_n)|(1 − |z_n|)^k = 0,   k=1, 2, ···, m,

and the sequence {z_n} is quasi-regular of order m.

We proceed to prove some related results.


(40) This follows from the fact that the functions g_n(ζ), being uniformly bounded in their totality in |z|<1, form a normal family, cf. P. Montel, Leçons sur les Familles Normales de Fonctions Analytiques, 1927, p. 21.




THEOREM 5. A necessary and sufficient condition that {z_n} be a regular sequence for a function f(z) regular in |z|<1 and bounded there: |f(z)|<M, is that it be quasi-regular of infinite order.

The condition is necessary. Indeed, form the functions

(11.3) g_n(ζ) = f((ζ + z_n)/(1 + z̄_n ζ)) − f(z_n).

This sequence of functions is uniformly bounded in |ζ|<1. From every subsequence can be extracted a new subsequence whose uniform limit is zero in every circle ρ(ζ, 0)<λ, where λ is the number of Definition 1. It follows that the sequence g_n(ζ) converges uniformly to zero in the circle |ζ|<(e^λ−1)/(e^λ+1). Lemma 1 of §2 shows that |f^{(k)}(z_n)|(1−|z_n|)^k→0 for k=1, 2, ···.

The condition is sufficient. Again form the functions (11.3). Since |g_n(ζ)| ≤ 2M in |ζ|<1, the functions form a normal family. A suitable subsequence converges uniformly in every circle |ζ| ≤ d<1 to some function G(ζ) which is regular and bounded in |ζ|<1 and G(0)=0. Expanding G(ζ) in a Taylor series about ζ=0:

(11.4) G(ζ) = c_1 ζ + c_2 ζ^2 + ···,

applying Lemma 2 of §2 and the hypothesis that {z_n} is quasi-regular of infinite order, we see that all the coefficients in the expansion (11.4) are zero and that therefore G(ζ)≡0.

Since we may repeat this argument starting with any subsequence of the family {g_n(ζ)}, it follows that g_n(ζ)→0 uniformly in any circle |ζ| ≤ d<1. From this follows at once the fact that {z_n} is a regular sequence for f(z).

A type of converse of Theorem 4 may be stated in the following form: 


THEOREM 6. Let f(z) be regular and bounded in the unit circle |z|<1: |f(z)| ≤ M. Let the sequence {z_n} be quasi-regular of order m. Then for every subsequence of the {z_n} there exists a new subsequence {z_{n_k}} with the property that to every ρ>0 which is sufficiently small there corresponds an integer N(ρ)>0 such that for all n_k>N(ρ) the function f(z) assumes the value f(z_{n_k}) precisely m+1 times in the circle Γ_{n_k}^ρ (counting multiplicities).

Again we form the functions (11.3). In view of Lemma 2 of §2 we have


(p) 2. p—» 
( ) 1,92 @—»)! 


The hypothesis that {z_n} is a quasi-regular sequence of order m for f(z) implies that

(11.5) lim_{n→∞} g_n^{(p)}(0) = 0, for p=1, 2, ···, m,


while 
(11.6) lim sup_{n→∞} |g_n^{(m+1)}(0)| > 0.

Let us select a subsequence of the family {g_n(ζ)} for which the actual limit in (11.6) exists and is positive and denote this subsequence for simplicity by {g_n(ζ)} again. Since for all n we have |g_n(ζ)| ≤ 2M in |ζ|<1, the sequence {g_n(ζ)} is a normal family. We may therefore extract a further subsequence {g_{n_k}(ζ)} which in every closed subregion of the circle |ζ|<1 converges uniformly to a function G(ζ). According to (11.5) and (11.6) we obtain G^{(p)}(0)=0 for p=1, 2, ···, m and G^{(m+1)}(0)≠0. Since G(0)=0, it follows that for every ρ>0 which is sufficiently small the function G(ζ) has precisely m+1 zeros in the circle |ζ|<ρ and is different from zero on the circumference |ζ|=ρ. Let us fix a definite value of ρ. By Hurwitz's theorem it follows that there exists an integer N(ρ)>0 so that each function g_{n_k}(ζ) for which n_k>N(ρ) has precisely m+1 zeros in the circle |ζ|<ρ. The theorem then follows immediately from the definition (11.3) of g_n(ζ).

12. Counterexamples (Gegenbeispiele). Theorem 4 may be used to obtain an example in which D_1(w_n)=0, while |f''(z_n)|(1−|z_n|)^2 does not tend to zero. Indeed, consider the Blaschke product(41)
(12.1) 


As is well known(42), since Σ (1−|z_n|) converges for z_n = (n!−1)/(n!+1), the product (12.1) represents in the circle |z|<1 an analytic function whose absolute value is less than unity. As was shown by one of the authors(43), the sequence {z_n} is an irregular sequence for φ(z). Now, form
f(z) = [φ(z)]^2.

Again, the sequence {z_n} is an irregular sequence for f(z). Furthermore, since the z_n are zeros of order 2 and the only zeros of f(z), we have D_1(0)=0 when the point w=0 is considered in any sheet of the Riemann configuration for w=f(z). On the other hand, the non-euclidean distance ρ(z_n, z_{n+1}) = log (n+1)→∞. Hence, for any λ>0 and for sufficiently large values of n the function f(z) vanishes precisely twice in Γ_n^λ. Applying Theorem 4, therefore, we find

lim sup_{n→∞} |f''(z_n)|(1 − |z_n|)^2 > 0.
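The distance relation used above can be verified exactly. The sketch assumes the zeros are z_n = (n!−1)/(n!+1) and that the non-euclidean distance of two real points a, b of the unit circle is log[(1+t)/(1−t)] with t = |(a−b)/(1−ab)|, the normalization under which a circle of non-euclidean radius λ about the origin is |ζ|<(e^λ−1)/(e^λ+1); the function names are illustrative only.

```python
import math
from fractions import Fraction

def zero(n):
    """Assumed zero z_n = (n! - 1)/(n! + 1) of the Blaschke product."""
    fact = math.factorial(n)
    return Fraction(fact - 1, fact + 1)

def distance_ratio(a, b):
    """(1 + t)/(1 - t) with t = |(a - b)/(1 - a b)|, for real a, b in (-1, 1)."""
    t = abs((a - b) / (1 - a * b))
    return (1 + t) / (1 - t)

def noneuclidean_distance(a, b):
    return math.log(distance_ratio(a, b))

for n in range(1, 6):
    print(n, distance_ratio(zero(n), zero(n + 1)), noneuclidean_distance(zero(n), zero(n + 1)))
```

With exact rational arithmetic the ratio comes out as n+1, so the distance between consecutive zeros is log (n+1), which grows without bound; for large n each circle of fixed non-euclidean radius about z_n therefore contains no other zero.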


(41) Such products were first introduced by W. Blaschke, Berichte über die Verhandlungen der Sächsischen Akademie der Wissenschaften, Mathematisch-Physische Klasse, Leipzig, vol. 67 (1915), pp. 194-200.

(42) Cf. G. Julia, ibid., pp. 65-66.
(*) W. Seidel, these Transactions, vol. 34 (1932), pp. 14-15. Equation (7.2) there should 
1 — 2/tn 


read 




Let us state this example as a theorem: 


THEOREM 7. There exists a bounded regular function f(z) in |z|<1 and a sequence of points {z_n} (|z_n|<1, |z_n|→1) in |z|<1 such that, setting w_n=f(z_n), lim_{n→∞} D_1(w_n)=0, while lim_{n→∞} |f''(z_n)|(1−|z_n|)^2>0.

Indeed, for the specific example already given we may assert f'(z_n)=0, D_1(w_n)=0.

The converse situation may also arise:

THEOREM 8. There exists a bounded regular function f(z) in |z|<1 and a sequence of points {z_n} (|z_n|<1, |z_n|→1) in |z|<1 such that, setting w_n=f(z_n), we have lim inf_{n→∞} D_1(w_n)>0, while lim_{n→∞} |f''(z_n)|(1−|z_n|)^2=0.


Let 
1+2 


= W = ——. 


The function W=(1+z)/(1−z) maps the circle |z|<1 on the half-plane ℜW>0. Now, for ℜW>0

|e^{−W+1}| = exp (−ℜW + 1) < e.

Hence, for |z|<1, |φ(z)|<(1+e)^2.


Direct computation shows that 


8 8 
= W+i(e-W+l — 4 W+l({ — 2e-Wtl), 


We now choose the points z_n=nπi/(nπi+1). It is clear that |z_n|<1 and lim_{n→∞} |z_n|=1. Setting W_n=(1+z_n)/(1−z_n), we find W_n=1+2nπi. Hence


(12.2) | | (1 —|2n|) = 0, as no, 


This incidentally gives another example for the proof of Theorem 7, since the relation D_1(w_n)→0 follows from (12.2) and (10.1).
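The properties of the points chosen above are easy to confirm numerically. The sketch assumes the points are z_n = nπi/(nπi+1) and W = (1+z)/(1−z) as in the text; it checks that W_n = 1+2nπi, that |z_n|<1 with 1−|z_n|^2 = 1/(1+n^2π^2), and that e^{−W_n+1} = 1. The function names are illustrative.

```python
import cmath
import math

def z_point(n):
    """Assumed point z_n = n*pi*i / (n*pi*i + 1)."""
    return (n * math.pi * 1j) / (n * math.pi * 1j + 1)

def w_transform(z):
    """W = (1 + z)/(1 - z), mapping |z| < 1 on the half-plane Re W > 0."""
    return (1 + z) / (1 - z)

for n in range(1, 4):
    zn = z_point(n)
    wn = w_transform(zn)
    print(n, abs(zn), wn, cmath.exp(-wn + 1))
```

Because e^{−W_n+1} = 1 at every point of the sequence, any expression built from e^{−W+1} takes a simple form there, which is what makes these points convenient for the examples of this section.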
Next, we introduce the function 


1 
Again, we observe that in |z|<1 the function ψ(z) is bounded:
| ¥(2)| 


Choosing again z_n=nπi/(nπi+1), we find

(12.3) |ψ'(z_n)|(1 − |z_n|^2) = 2, for all n,




and 
(12.4) —| as no. 
Finally, we introduce the function 
F(z) = o(2) + 
It is clear that f(z) is bounded in the circle |z|<1, satisfying there the inequality
| f(z) | < (1 + 2¢. 


For the sequence of points z_n=nπi/(nπi+1) by virtue of (12.2), (12.3), (12.4) we have the relations


| | (1 — | |?) = 4 0, for all n, 
and 
— | |?)? 0 as 


It follows from Theorem 3 that D_1(w_n) has a positive lower bound. The function f(z) is therefore an example of a function with the properties asserted in Theorem 8, and Theorem 8 is established.

By forming the function f(z)=aφ(z)+bψ(z) with arbitrary constants a and b, one can now obtain arbitrary limits for |f'(z_n)|(1−|z_n|^2) and |f''(z_n)|(1−|z_n|^2)^2 as n→∞.

It may be observed that if the condition |z_n|→1 in Theorems 7 and 8 were dropped, one might take as examples to prove the theorems the simple functions f(z)=z^2 and f(z)=z, z_n=0, respectively.

Finally, it may be noted that in the Theorems 4-6, the boundedness of f(z) was assumed merely in order to ensure the normality of the family g_n(ζ). Thus, it would have sufficed to assume that f(z) has 2 exceptional values and is bounded.


13. Definition and some properties of C_p.

DEFINITION 3. Let C_p be a simply connected Riemann configuration containing the point w_0, lying over the circle |w−w_0|<ρ and covering it precisely p times. Such a region C_p will be called a p-sheeted circle of center w_0 and radius ρ.

We shall exclude the case ρ=∞ (called an improper p-sheeted circle) for a reason that will be given a little later. It should be observed that the center of a p-sheeted circle is not uniquely defined.

The necessity of assuming explicitly (rather than proving) in Definition 3 that C_p shall be simply connected may be seen from the following example. Consider in the w-plane the (simply connected) Riemann surface of the function ((w−α)/(w−β))^{1/2} where α and β are two complex numbers, with the branch line the rectilinear segment αβ. Let us now cut this surface by a circular biscuit-cutter which includes the two points α and β. The resulting circular



region cut out of the surface satisfies all the requirements in Definition 3 except the condition of simple connectivity. In fact, every region lying over a circle |w−w_0|<ρ and covering it precisely twice ceases to be simply connected as soon as it has two branch points or more. Indeed, in such a case it is clearly possible to find a cut joining two boundary points and crossing a branch line which will not sever the surface. In general, by applying the theorem of Bôcher and Walsh (as in the proof of Theorem 13 below) one may easily show that every region lying over a circle |w−w_0|<ρ and covering it precisely p times ceases to be simply connected as soon as the sum of the multiplicities of its branch points exceeds p−1. The multiplicity of a branch point is to be understood as one less than the number of sheets which come together at that point. An algebraic branch point (but not a transcendental one) is to be considered as belonging to the Riemann configuration.

One may prove some immediate consequences of Definition 3.


THEOREM 9. Any p-sheeted circle over the w-plane can be mapped in a one-to-one conformal manner on the unit circle |z|<1.

According to the fundamental theorem of uniformization the p-sheeted circle C_p, being simply connected, may be mapped in a one-to-one conformal manner either on a circle, or on the full plane, or on the full plane from which the point at infinity is excluded. Denote the mapping function by w=f(z). Since C_p is a bounded region, the function f(z) must be bounded. This is certainly not possible in the two latter cases. Thus, C_p can be mapped only on a circle.


THEOREM 10. A p-sheeted circle C_p with center w_0 and radius ρ can be mapped in a one-to-one conformal manner on the unit circle |z|<1 by means of a function of the form

(13.1) f(z) = w_0 + ρ e^{iθ} z^k Π_{j=1}^{p−k} (z − z_j)/(1 − z̄_j z),

where θ is an arbitrary real number, k an integer satisfying the inequality 0<k≤p, and where z_1, z_2, ···, z_{p−k} are points of the unit circle |z|<1. Conversely, every function of the form (13.1) realizes a one-to-one conformal map of the unit circle |z|<1 on some p-sheeted circle with center at w_0 and radius ρ.
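A function of this type can be explored numerically. The sketch below assumes the form f(z) = w_0 + ρ e^{iθ} z^k Π (z − z_j)/(1 − z̄_j z) with p−k factors, and uses arbitrary sample values for w_0, ρ, θ, k and the points z_j (all hypothetical, chosen only for illustration). It checks that |f(z) − w_0| = ρ on |z|=1 and that the image of the unit circumference winds p times about w_0.

```python
import cmath
import math

def sheeted_circle_map(z, w0, rho, theta, k, points):
    """Assumed form: w0 + rho e^{i theta} z^k prod (z - a)/(1 - conj(a) z)."""
    value = rho * cmath.exp(1j * theta) * z ** k
    for a in points:
        value *= (z - a) / (1 - a.conjugate() * z)
    return w0 + value

def winding_number(w0, rho, theta, k, points, samples=4000):
    """Number of turns of f(e^{it}) about w0, from accumulated phase increments."""
    total = 0.0
    prev = sheeted_circle_map(1 + 0j, w0, rho, theta, k, points) - w0
    for i in range(1, samples + 1):
        z = cmath.exp(2j * math.pi * i / samples)
        cur = sheeted_circle_map(z, w0, rho, theta, k, points) - w0
        total += cmath.phase(cur / prev)   # small increment between neighbors
        prev = cur
    return round(total / (2 * math.pi))

# Hypothetical sample data: k = 1 and two further points, so p = 3
W0, RHO, THETA, K = 1 + 1j, 2.0, 0.3, 1
POINTS = [0.5 + 0.2j, -0.3j]
print(winding_number(W0, RHO, THETA, K, POINTS))
```

The winding number equals k plus the number of points z_j, that is p, in agreement with the statement that the circle |w−w_0|<ρ is covered precisely p times.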


In speaking of conformality, it must be remembered that it will break down at a branch point. To prove the first part of the theorem introduce a similarity transformation in the w-plane with center in w_0 which transforms the circle C_p into a p-sheeted circle C_p' of radius 1. By means of a translation we can always bring the point w_0 into the origin. The resulting one-to-one map of |z|<1 on C_p' can be interpreted as a (1, p) conformal correspondence of a unit circle on itself. By applying Radó's theorem(44) on the representation of such correspondences we obtain the expression (13.1). The converse may also be derived from Radó's theorem together with a translation and similarity transformation in the w-plane.

(44) T. Radó, Acta Litterarum ac Scientiarum Regiae Universitatis Hungaricae Francisco-Josephinae, Szeged, vol. 1 (1922), p. 55. Preliminary related work is due also to Fatou and Julia.

A remark will now be made to justify the exclusion of the case ρ=∞ in the definition of C_p. An improper p-sheeted circle could be interpreted as the w-plane covered precisely p times. If such a circle belonged to a simply connected Riemann surface, the surface could not be of hyperbolic type and consequently Theorems 9 and 10 would no longer apply. Suppose first that a simply connected Riemann surface which contains an improper p-sheeted circle could be mapped conformally on the unit circle. Thereby the p-sheeted circle would be transformed into a simply connected subregion of the unit circle. Now if the improper p-sheeted circle has no boundary points such a transformation is clearly impossible. Suppose then that the p-sheeted circle has the point w=∞ as a boundary point. Then, the mapping function in the unit circle approaches infinity whenever the point z approaches the boundary of the subregion. This is again impossible.


THEOREM 11. Let C_p be a p-sheeted circle with center at w_0 and radius R. Let c_p be a subregion of C_p which lies over a circle |w−w_0|<r, where r<R, and covers it precisely p times. Then, c_p is also simply connected.

We can map C_p on the unit circle |z|<1 in accordance with Theorem 9. The mapping function w=f(z) is regular in |z|<1 and maps c_p on a certain subregion B of |z|<1. On the boundary Γ of B we have |f(z)−w_0|=r, while in the interior of B we have |f(z)−w_0|<r. From the maximum modulus principle it follows that B is simply connected. Since the map defined by w=f(z) is topological, the image c_p of B must likewise be simply connected.

In order to establish the uniqueness in Definition 3, we shall prove 


THEOREM 12. Let $R$ be a simply connected Riemann surface of hyperbolic
type. Let $w_0$ be a point of $R$. Let $C_p$ and $C_p'$ be two p-sheeted circles with center
at $w_0$ and radius $\rho$. Then $C_p$ and $C_p'$ are identical.


If we map $R$ on the unit circle $|z|<1$ by means of the function $w=f(z)$
so that $f(0)=w_0$, the two circles $C_p$ and $C_p'$ will be mapped on two regions $B$
and $B'$ belonging to the circle $|z|<1$. In the interiors of $B$ and $B'$ we have
$|f(z)-w_0|<\rho$ and on the boundaries $|f(z)-w_0|=\rho$. Furthermore, both
regions $B$ and $B'$ contain the origin. Thus, unless $B$ and $B'$ are identical, at
least one boundary point of one region, say $B$, will be interior to the other
region $B'$. This, however, constitutes a contradiction.

14. Definition of $D_p$.


DEFINITION. Let $w=f(z)$, regular in the unit circle $|z|<1$, map the circle
on a Riemann configuration $R$. That is to say, $R$ is an arbitrary simply connected
Riemann configuration of hyperbolic type over the finite w-plane. Let $w_0$ be an
arbitrary point belonging to $R$. A non-negative number $D_p(w_0)$, called the radius
of p-valence of $R$ at the point $w_0$, shall be associated with the point $w_0$ in the
following manner:

(a) For $p=1$, we define $D_p(w_0)=D_1(w_0)$ (see §1).

(b) If there exists a p-sheeted circle with center $w_0$ contained in $R$, there exists
a largest such circle, and the radius of this largest circle is defined as $D_p(w_0)$.

(c) If $p>1$, and if $w_0$ is a branch point of order greater than $p-1$, then
$D_p(w_0)=0$.

(d) If there exists no p-sheeted circle ($p>1$) with center $w_0$ contained in $R$,
and if $w_0$ is not a branch point of order greater than $p-1$, then we define $D_p(w_0)$
as $D_{p-1}(w_0)$.


It should be observed that in the definition in part (b) the existence of a
largest p-sheeted circle with center in $w_0$ contained in $R$ is asserted and still
requires some justification. From Theorem 12 it follows that if such a circle
exists, it must be unique. Furthermore, as one starts with a p-sheeted circle
with center in $w_0$ contained in $R$ and proceeds to enlarge its radius, it can
never happen that it becomes multiply connected and, on enlarging the radius
still more, finally again becomes simply connected. This possibility is ruled
out by Theorem 11. Finally, the existence of a p-sheeted circle with center
in $w_0$ and contained in $R$ whose radius is the least upper bound of the radii
of all p-sheeted circles with center in $w_0$ and contained in $R$ can be established
by simple considerations of continuity, which are left to the reader.

The number $D_p(w_0)$ is not, as the notation would seem to indicate, a
function merely of $w_0$, a value of $w$, but is rather a function of a specific point of $R$
whose affix is $w_0$; thus $D_p(w_0)$ is precisely a function of $z_0$, where $R$ is
determined by the transformation $w=f(z)$. However, no confusion is likely to
result from the slight lack of definiteness in the notation $D_p(w_0)$. We denote
by $R_p(w_0)$ the unique region of $R$ which is a q-sheeted circle $C_q$ ($q\le p$) whose
center is $w_0$ and radius $D_p(w_0)$.

For the sake of clearness, we present now a numerical illustration of the
definition of $D_p(w_0)$. Let $R$ consist of the doubly-carpeted unit circle $|w|<1$
with branch point of the first order at the origin $w=0$, except that in the
second sheet there is deleted the subregion of $|w|<1$ contained in the region
$|w+1|<1/3$; for definiteness choose the branch line as the segment $0\le w<1$;
of course this configuration $R$ can be mapped in a one-to-one manner on
$|z|<1$ by a single-valued function $w=f(z)$, as can be seen at once by use of
the auxiliary transformation $w=z_1^2$, which maps $R$ onto a smooth Jordan
region of the $z_1$-plane. We obviously have $D_2(0)=2/3$, for the doubly-carpeted
(that is, two-sheeted) circle $|w|<2/3$ is contained in $R$, and that is true of no
larger concentric doubly-carpeted circle. When $w_0$ is positive, and in either
sheet of $R$, we have

$D_2(w_0) = w_0 + 2/3, \qquad 0 \le w_0 \le 1/6,$

$D_2(w_0) = 1 - w_0, \qquad 1/6 \le w_0 < 1;$

for positive $w_0$, the size of the region $R_2(w_0)$ is limited by the nearer of the
two points $-2/3$, $+1$. When $w_0$ moves from the origin to the left in the first
sheet, the size of $R_2(w_0)$ continues to be limited by the point $w=-2/3$:

$D_2(w_0) = w_0 + 2/3, \qquad -1/3 \le w_0 < 0.$

But when the point $w_0$ continues to the left from the point $w_0=-1/3$, the
size of $R_2(w_0)$ is now no longer limited by the point $-2/3$, but is conditioned
by the necessity of including no point of $|w+1|<1/3$, hence is limited by the
origin; the corresponding region cut out of $R$ is smooth, merely the region
$|w-w_0|<|w_0|$:

$D_2(w_0) = -w_0, \qquad -1/2 \le w_0 \le -1/3.$

As $w_0$ moves further to the left from $w_0=-1/2$, still in the first sheet of $R$,
the region $R_2(w_0)$ is now limited only by the point $w=-1$:

$D_2(w_0) = 1 + w_0, \qquad -1 < w_0 \le -1/2.$

When $w_0$ moves from the origin to the left in the second sheet of $R$, the size
of $R_2(w_0)$ is also limited by the point $w=-2/3$:

$D_2(w_0) = w_0 + 2/3, \qquad -2/3 < w_0 < 0;$

this situation continues as $w_0$ moves from the value zero to the value $-2/3$,
but the region $R_2(w_0)$ is a doubly-carpeted circle for $-1/3<w_0<0$, and is
singly-carpeted (smooth) for $-2/3<w_0\le -1/3$. This completes the study of
our numerical case.
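
The case analysis of this example can be collected into one piecewise function. The Python sketch below is offered only as an illustration: the name `d2_example` and the `sheet` argument are introduced here and do not occur in the text, and the formulas are simply those stated above for real $w_0$.

```python
from fractions import Fraction as F


def d2_example(w0, sheet=1):
    """D_2(w0) for real w0 on the example surface, using the formulas in the text.

    `sheet` (1 or 2) is a hypothetical label for the two sheets; it matters
    only for negative w0, since the region |w + 1| < 1/3 is deleted from the
    second sheet alone.
    """
    w0 = F(w0)
    if w0 >= 0:
        if w0 >= 1:
            raise ValueError("w0 must be less than 1")
        # Limited by the nearer of the two points -2/3 and +1.
        return min(w0 + F(2, 3), 1 - w0)
    if sheet == 1:
        if w0 <= -1:
            raise ValueError("w0 must be greater than -1")
        if w0 >= F(-1, 3):
            return w0 + F(2, 3)   # still limited by the point -2/3
        if w0 >= F(-1, 2):
            return -w0            # limited by the origin
        return 1 + w0             # limited by the point -1
    if w0 <= F(-2, 3):
        raise ValueError("on the second sheet w0 must exceed -2/3")
    return w0 + F(2, 3)


# The three first-sheet formulas agree at the break points -1/3 and -1/2.
print(d2_example(F(-1, 3)), d2_example(F(-1, 2)))
```

Exact rational arithmetic is used so that agreement of the formulas at the break points $1/6$, $-1/3$ and $-1/2$ can be checked without rounding.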

Let us now discuss the manner in which $D_p(w_0)$ and $R_p(w_0)$ vary on the
general Riemann configuration $R$, the image of $|z|<1$ under the arbitrary
map $w=f(z)$, where $f(z)$ is analytic in $|z|<1$. The various possibilities that
arise are illustrated by the example just given. We cut all the sheets of $R$
through with a circular biscuit-cutter whose center is $w_0$ and whose radius
is the variable $r$. One of the connected sets thus cut out of $R$ contains $w_0$ and
is denoted by $R_1$. When $r$ is small it follows from the usual implicit function
theorem that if $w_0$ is not a branch point of $R$ the region $R_1$ is smooth, and if $w_0$
is a q-fold point of $R$, then $R_1$ consists of a q-sheeted circle whose only branch
point is $w_0$. As $r$ is gradually increased, this situation continues until the
boundary of $R_1$ reaches either a boundary point of $R$ or a branch point of $R$.
In the former case we have $D_p(w_0)$ equal to this particular value $r_1$ of $r$, and
$R_1$ is $R_p(w_0)$. In the latter case if $r$ is further increased, it may be that $R_1$
becomes a q'-sheeted circle with $q<q'\le p$, in which case we have $D_p(w_0)\ge r>r_1$.
But it may occur that whenever $r$ is near to but greater than $r_1$ the region $R_1$
is a q''-sheeted circle, $q''>p$, in which case we have $D_p(w_0)=r_1$; it may also
occur that whenever $r$ is near to but greater than $r_1$ the region $R_1$ has
boundary points in common with $R$, in which case we have also $D_p(w_0)=r_1$. If we
have $D_p(w_0)>r_1$, the radius $r$ can be perhaps increased until still further
branch points of $R$ lie interior to $R_1$, while $R_1$ remains a $q_1$-sheeted circle whose
center is $w_0$, with $q_1\le p$. In any case the radius $r$ can be increased from zero to
such a value $r_2$ that: (i) either a boundary point of $R$ lies on the boundary of
$R_1$, (ii) or there lie on the boundary of $R_1$ branch points of $R$ of such
multiplicities that for all values of $r$ slightly greater than $r_2$ the region $R_1$ containing $w_0$
and cut out of $R$ by the biscuit-cutter with center $w_0$ and radius $r$ is a
q''-sheeted circle with $q''>p$, (iii) or there lie on the boundary of $R_1$ branch
points of $R$ of such nature that for all values of $r$ slightly greater than $r_2$ this
region $R_1$ has boundary points which satisfy the relation $|w-w_0|<r_2$. It is to
be noted that if the biscuit-cutter of radius $r$ cuts from $R$ the region $R_1$
containing $w_0$, and if $R_1$ has a boundary point $w_1$ (necessarily a boundary point
of $R$) for which $|w_1-w_0|<r$, then we must have $D_p(w_0)<r$. For under these
conditions $R_1$ cannot be a q-sheeted circle; the point $w_1$ of the w-plane may be
covered by $R_1$ precisely $q$ times (not necessarily by $q$ sheets meeting at $w_1$),
but then (by the implicit function theorem) a suitably chosen neighborhood
of $w_1$ is also covered precisely $q$ times by the sheets of $R_1$ that cover $w_1$, and
suitable points $w$ in this neighborhood are covered more than $q$ times in all, for
they are covered also by $R_1$ in the neighborhood of the boundary point $w_1$.

It is of interest to trace also the situation in the z-plane corresponding to
the preceding discussion. When $r$ is sufficiently small, $r>0$, the locus
$|f(z)-w_0|=r$ consists (in addition to possible other arcs or curves) of a
Jordan curve $J(r)$ in the neighborhood of the point $z_0$, where $w_0=f(z_0)$; for
$r$ sufficiently small, interior to $J(r)$ the function $f(z)$ takes on every value that
it assumes (by Theorem 13 below) precisely a number of times equal to the
multiplicity $q$ of $z_0$ as a zero of the function $f(z)-w_0$; the image of the
interior of $J(r)$ over the w-plane is a q-sheeted circle of radius $r$ whose only
branch point is $w=w_0$. As $r$ now increases, this situation continues until $J(r)$
reaches $|z|=1$ or until at least one multiple point of $J(r)$ appears (at a
multiple point the tangents to $J(r)$ are equally spaced); in the former case we
simply have $D_p(w_0)$ equal to the corresponding value $r_1$ of $r$; in the latter case
for values of $r$ near to but slightly greater than $r_1$, the locus $|f(z)-w_0|=r$
consists of a Jordan arc $J_1$ near but exterior to $J(r_1)$ plus other Jordan arcs
forming with $J_1$ a maximal connected set which we denote by $J(r)$; still other
Jordan arcs may belong to the locus and not be connected with $J_1$, but such
arcs do not concern us at present. If for every $r$ near to but slightly greater
than $r_1$ the set $J(r)$ has a boundary point on $|z|=1$, then we have $D_p(w_0)=r_1$;
in the contrary case $J(r)$ consists of a Jordan curve in $|z|<1$ containing $J(r_1)$
in its interior; the function $f(z)$ takes on interior to $J(r)$ all the values that
it takes on there the same number of times, say $q'$. If $q'$ is greater than $p$ we
have $D_p(w_0)=r_1$, but if $q'$ is not greater than $p$ we have $D_p(w_0)>r_1$, and the
process of enlarging $J(r)$ can continue beyond $r=r_1$. The process continues
as $r$ increases, and $J(r)$ may pass through multiple points, thereby increasing
not merely $r$ but also the number of times (the same for all values) that $f(z)$
takes on interior to $J(r)$ values that it takes on there. The process eventually
comes to an end at some value $r=r_2=D_p(w_0)$, either because $J(r_2)$ reaches
the boundary $|z|=1$ and hence is no longer a Jordan curve in $|z|<1$, or
because the locus $|f(z)-w_0|=r_2$ has a multiple point, and for every $r>r_2$ but
near to $r_2$ the locus $|f(z)-w_0|=r$ either fails now to separate $z_0$ from $|z|=1$
or divides $|z|<1$ into regions of which the one containing $z_0$ is a Jordan region
in which each value assumed is assumed more than $p$ times.

15. Some properties of $D_p$. We return now to the general theory of
$D_p(w_0)$; an important tool is(45)


THEOREM 13. Let $f(z)$ not identically constant be analytic in the simply
connected region $B$, let $|f(z)|$ be continuous in the corresponding closed region and
have the constant value $b$ on the boundary $C$ of $B$. Then all values $w$ taken on by
$f(z)$ in $B$ are taken on there the same number of times $q$, and $f'(z)$ has precisely
$q-1$ zeros interior to $B$.


The region $B$ cannot be the entire plane or the entire plane with the
omission of a single point, so $B$ can be mapped conformally onto the interior of
the unit circle $\gamma$. It is sufficient to establish the theorem where $B$ is the
interior of $\gamma$, which we shall now do. We must have $b>0$, so by the well known
properties of the maxima and minima of $|f(z)|$, the zeros of $f(z)$ interior to $\gamma$
are finite in number, $\beta_1, \beta_2, \dots, \beta_q$ with $q>0$. The function

$$f(z)\prod_{k=1}^{q}\frac{1-\bar\beta_k z}{z-\beta_k},$$

when suitably defined in the points $\beta_k$, is analytic and different from zero at
every point interior to $\gamma$; its modulus is continuous in the corresponding closed
region and takes the constant value $b$ on $\gamma$. Hence this function itself is a
constant of modulus $b$, and we have

$$f(z)=\omega\prod_{k=1}^{q}\frac{z-\beta_k}{1-\bar\beta_k z},\qquad |\omega|=b.$$

The first part of Theorem 13 now follows from Rouché's theorem, for if we
have $|c|<b$ we have on $\gamma$ the inequality $|c|<|f(z)|$. The latter part of
Theorem 13 follows from a theorem due to Bôcher and Walsh(46).
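
Both assertions of Theorem 13 can be checked numerically on a product of the form just obtained. The following Python sketch is an illustration only: the zeros in `ZEROS`, the value `C`, and all function names are assumptions chosen here. The counts are taken with the argument principle on $|z|=1$, a standard device and not the method of the proof.

```python
import cmath
import math


def blaschke(z, zeros, omega=1.0):
    """omega * prod (z - b) / (1 - conj(b) z), the form reached in the proof."""
    value = complex(omega)
    for b in zeros:
        value *= (z - b) / (1 - b.conjugate() * z)
    return value


def blaschke_derivative(z, zeros, omega=1.0):
    """f'(z), using f'/f = sum (1 - |b|^2) / ((z - b)(1 - conj(b) z))."""
    log_derivative = sum(
        (1 - abs(b) ** 2) / ((z - b) * (1 - b.conjugate() * z)) for b in zeros
    )
    return blaschke(z, zeros, omega) * log_derivative


def count_zeros_in_unit_disc(g, samples=4000):
    """Net change of arg g(z) around |z| = 1, divided by 2*pi.

    This counts zeros only because g has no poles in |z| <= 1 and no zeros
    on |z| = 1 for the functions used below.
    """
    total = 0.0
    previous = g(complex(1.0))
    for k in range(1, samples + 1):
        current = g(cmath.exp(2j * math.pi * k / samples))
        total += cmath.phase(current / previous)
        previous = current
    return round(total / (2 * math.pi))


ZEROS = [0.5 + 0j, -0.3 + 0.4j, 0.2j]   # q = 3 illustrative zeros in |z| < 1
C = 0.3 + 0.2j                           # an illustrative value with |C| < b = 1

times_taken = count_zeros_in_unit_disc(lambda z: blaschke(z, ZEROS) - C)
critical_points = count_zeros_in_unit_disc(lambda z: blaschke_derivative(z, ZEROS))
print(times_taken, critical_points)
```

With $q=3$ zeros the theorem predicts that the value `C` is taken three times and that $f'(z)$ has two zeros in the disc.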


THEOREM 14. Let the function $w=f(z)$ analytic for $|z|<r$ with $f(0)=0$ map
$|z|<r$ onto a Riemann configuration $R$ such that no point of the boundary of $R$


(45) The part of this theorem which refers to the zeros of $f'(z)$ is not new, if $q$ is defined as
the number of zeros of $f(z)$ in $B$, and has been considered by de Boer, Macdonald, de la Vallée
Poussin, Whittaker and Watson, Denjoy, Lange-Nielsen, and Alander. See for instance Denjoy,
Comptes Rendus de l'Académie des Sciences, Paris, vol. 166 (1918), pp. 31-33; Alander, Comptes
Rendus de l'Académie des Sciences, Paris, vol. 184 (1927), pp. 1411-1413.

(46) J. L. Walsh, these Transactions, vol. 19 (1918), pp. 291-298, especially p. 297.




satisfies the inequality $|w|<\rho$ ($\rho>0$). Then the connected region $R_1$ of $R$ which
contains the transform of $z=0$ and which is cut out of $R$ by a biscuit-cutter whose
center is $w=0$ and radius $\rho$ is simply connected; and each point $w$ of the w-plane
with $|w|<\rho$ is covered by $R_1$ the same number of times.


The region $R_1$ corresponds to some region $R_1'$ in $|z|<r$ containing $z=0$.
The function $|f(z)|$ is continuous in the closed region consisting of $R_1'$ plus
its boundary, and assumes the constant value $\rho$ on the boundary; of course
the boundary of $R_1'$ may coincide in whole or in part with $|z|=r$. It follows
from the principle of maximum modulus applied to $f(z)$ in $|z|<r$ that the
boundary of $R_1'$ cannot fall into two or more continua, one of which would
necessarily lie in a simply connected region interior to $|z|=r$ bounded by
another continuum belonging to the boundary of $R_1'$. Then $R_1'$ is simply
connected, and so consequently is $R_1$. The remainder of Theorem 14 follows from
Theorem 13.


THEOREM 15. Let $w=f(z)$ be analytic for $|z|<1$ and map $|z|<1$ onto the
Riemann configuration $R$ with $w_0=f(z_0)$, $|z_0|<1$. Let $f(z)$ take on in $|z|<1$
every value $w$ in the region $|w-w_0|<\rho$ ($\rho>0$) precisely $p$ times. Then we have
$D_p(w_0)\ge\rho$.


No boundary point $w_1$ of $R$ can satisfy the inequality $|w_1-w_0|<\rho$; for
if it did the point $w_1$ of the w-plane would be covered by $R$ a totality of $p$
times, and by the implicit function theorem a suitably chosen neighborhood
of $w_1$ would also be covered by $R$ precisely $p$ times by the sheets of $R$
covering $w_1$. Some values $w$ in every neighborhood of $w_1$ are covered also by the
sheet (or sheets) of $R$ of which $w_1$ is a boundary point; so some points $w$ with
$|w-w_0|<\rho$ are covered more than $p$ times, contrary to hypothesis.

We have now shown that no boundary point of $R$ satisfies the inequality
$|w-w_0|<\rho$; so it follows from Theorem 14 that the region containing $w_0$ cut
out of $R$ by a biscuit-cutter of center $w_0$ and radius $\rho$ covers each point of
$|w-w_0|<\rho$ the same number of times, a number which by the hypothesis
of Theorem 15 cannot exceed $p$; hence Theorem 15 is established.

Still another result related to Theorems 14 and 15 follows easily: 


THEOREM 16. Let the function $w=f(z)$ be analytic for $|z|<1$ and map
$|z|<1$ onto the Riemann configuration $R$ with $w_0=f(z_0)$, $|z_0|<1$. Suppose
$\liminf_{|z|\to 1}|f(z)-w_0|\ge\rho$, and suppose no value in $|w-w_0|<\rho$ is taken on
by $f(z)$ in $|z|<1$ more than $p$ times. Then we have $D_p(w_0)\ge\rho$.


It follows from our hypothesis that no boundary point of $R$ lies in
$|w-w_0|<\rho$; so Theorem 16 follows from Theorems 14 and 15.


COROLLARY. Let $w=f(z)$ be analytic for $|z|<1$ and map $|z|<1$ onto the
Riemann configuration $R$ with $w_0=f(z_0)$, $|z_0|<1$. Let $R_1$ be a subregion of $|z|<1$
containing $z_0$, whose boundary $B$ satisfies the condition
$\lim_{z\to B,\ |z|<1}|f(z)-w_0|=\rho>0$, and suppose no value $w$ is taken on by $f(z)$
in $R_1$ more than $p$ times. Then we have $D_p(w_0)\ge\rho$.


It follows from the principle of maximum modulus that $R_1$ is simply
connected. If $R_1$ is mapped smoothly and conformally onto $|t|<1$, and if
Theorem 16 is applied to the function which maps $|t|<1$ onto $R_1$, we obtain the
corollary.

Although the following theorem is not needed in the sequel, it is of some 
interest in itself. 


THEOREM 17. Let $R$ be a simply connected Riemann configuration of
hyperbolic type, and let $w_0$ be any point of $R$. Then $D_p(w_0)$ is a continuous function
of $w_0$.


We need to define what we shall mean by the continuity of $D_p(w_0)$ on $R$.
If $w_0$ is a branch point of order greater than $p-1$ ($p>1$), we shall say that
$D_p(w_0)$ is continuous at $w_0$ if to any $\varepsilon>0$ we can assign a number $\delta>0$ so
that for any point $w_0'$ at a distance not greater than $\delta$ from $w_0$ and lying on one
of the sheets that come together at $w_0$ the relation $|D_p(w_0')-D_p(w_0)|<\varepsilon$
holds; here $D_p(w_0)=0$. If $w_0$ is a branch point of order $q$, where $0\le q\le p-1$,
we shall say that $D_p(w_0)$ is continuous at $w_0$ if to any $\varepsilon>0$ we can assign a
number $\delta>0$ so that for any point $w_0'$ within the q-sheeted circle $C_q$ with
center at $w_0$ and radius $\delta$ the relation $|D_p(w_0')-D_p(w_0)|<\varepsilon$ holds. The proof
of this theorem is left to the reader.

16. The limit property of $D_p$ for continuously convergent sequences.


THEOREM 18. Let $\{f_n(z)\}$ be a sequence of functions analytic in the unit
circle $|z|<1$, and converging uniformly in every closed subregion of $|z|<1$ to
an analytic function $f(z)$. Let $z_0$ be any point in the circle $|z|<1$ and set
$w_n=f_n(z_0)$, $w_0=f(z_0)$. Denoting by $D_p(w_n)$ the radius of p-valence at the point $w_n$
of the Riemann configuration $\mathcal{R}_n$ on which $f_n(z)$ maps the circle $|z|<1$ and by
$D_p(w_0)$ the radius of p-valence at the point $w_0$ of the Riemann configuration $\mathcal{R}_0$
on which $f(z)$ maps the circle $|z|<1$, we have

(16.1) $\lim_{n\to\infty} D_p(w_n) = D_p(w_0), \qquad p = 1, 2, 3, \dots.$


The proof of the theorem will be based on two lemmas:

LEMMA 1. Under the conditions of Theorem 18,

(16.2) $\liminf_{n\to\infty} D_p(w_n) \ge D_p(w_0).$

The lemma is clearly trivial if $D_p(w_0)=0$.
Let us assume, therefore, that $D_p(w_0)>0$ and choose any positive
number $\rho$ so that $\rho<D_p(w_0)$. Hence, the Riemann configuration $\mathcal{R}_0$ contains in its
interior some q-sheeted circle $C_q(w_0)$, $1\le q\le p$, of center $w_0$ and radius $\rho$,
together with its boundary. Denote the region in $|z|<1$ on which the function
$w=f(z)$ maps $C_q(w_0)$ by $R_0$. The boundary $B_0$ of $R_0$ must consequently lie
wholly in the interior of $|z|<1$. In $R_0$ we have $|f(z)-w_0|<\rho$ and on $B_0$
we have $|f(z)-w_0|=\rho$. Let $\varepsilon>0$ be any number such that $\rho+\varepsilon<D_p(w_0)$.
Due to the uniform convergence of the sequence $f_n(z)$ on $B_0$, there exists a
positive integer $n(\varepsilon)$ such that for all integers $n>n(\varepsilon)$ the inequality
$|f_n(z)-w_n|>\rho-\varepsilon$ holds on $B_0$. Hence, that region $R_n$ in the circle $|z|<1$
which contains the point $z_0$ and on which $|f_n(z)-w_n|<\rho-\varepsilon$ lies wholly
interior to $R_0$. On the boundary $B_n$ of $R_n$ we have $|f_n(z)-w_n|=\rho-\varepsilon$. In
accordance with Theorem 13 the function $f_n(z)$ takes on all its values the same
number of times $q_n$ in $R_n$. By Hurwitz's theorem, since $f(z)$ is at most
p-valent(47) in $R_0$, for sufficiently large values of $n$ we have $q_n\le p$. Hence,
by the corollary to Theorem 16 we have $D_p(w_n)\ge D_{q_n}(w_n)\ge\rho-\varepsilon$. Hence,
$\liminf_{n\to\infty} D_p(w_n)\ge\rho$. But $\rho$ is an arbitrary positive number less than $D_p(w_0)$.
The relation (16.2) follows at once.


LEMMA 2. Under the conditions of Theorem 18,

(16.3) $\limsup_{n\to\infty} D_p(w_n) \le D_p(w_0).$


If this lemma is false there must exist a positive constant $a$ such that for
infinitely many values of $n$


(16.4) $D_p(w_n) > a > D_p(w_0).$


We shall neglect all those functions $f_n(z)$ for which the above inequality fails
and assume that (16.4) holds for all $n$.

Consider that largest region $R_n$ in the circle $|z|<1$ which contains the
point $z_0$, for which $|f_n(z)-w_n|<a$. Then in $R_n$ the function $f_n(z)$ is q-valent
($q\le p$). According to (16.4) the boundary $C_n$ of the region $R_n$ lies wholly in
the circle $|z|<1$. Furthermore, by the principle of maximum modulus we
conclude that $R_n$ is simply connected. Clearly, on the curve $C_n$ the relation
$|f_n(z)-w_n|=a$ is satisfied. Every value taken on by $f_n(z)$ in $R_n$ is taken on
the same number of times.

Denote by $z=\varphi_n(t)$ a function which maps the region $R_n$ on the circle
$|t|<1$ in such a manner that $\varphi_n(0)=z_0$. Since the curve $C_n$ is a Jordan curve,
by a well known theorem of Osgood-Carathéodory the function $\varphi_n(t)$ is
continuous in the closed circle $|t|\le 1$(48). The function $f_n(\varphi_n(t))=g_n(t)$ is analytic
in $|t|<1$, continuous in $|t|\le 1$ and $|g_n(t)-w_n|=a$ on $|t|=1$. By Schwarz's
reflection principle(49), we infer that $g_n(t)$ is analytic in the closed circle


(47) We shall say that a function $f(z)$ is p-valent in a region $R$ if it assumes no value more
than $p$ times in $R$ and at least one value precisely $p$ times. A function $f(z)$ will be called at most
p-valent in $R$ if it is q-valent in $R$ for some $q\le p$.

(48) W. F. Osgood and E. H. Taylor, these Transactions, vol. 14 (1913), pp. 277-298;
C. Carathéodory, Mathematische Annalen, vol. 73 (1913), pp. 305-320.

(49) Cf. G. Julia, loc. cit., p. 44 ff.




$|t|\le 1$. Finally, $g_n(t)$ is precisely q-valent in $|t|<1$ since $f_n(z)$ possesses the
same property in $R_n$. By the theorem of Radó, referred to earlier, we may
represent $g_n(t)$ in the following manner:

(16.5) $g_n(t) = w_n + c_n\prod_{j=1}^{k}\frac{t-t_j^{(n)}}{1-\bar t_j^{(n)}\,t}, \qquad k\ge 1;\ |t_j^{(n)}|\le 1.$


Since the $g_n(t)$ are uniformly bounded, they form a normal family and we may
select a subsequence, which for simplicity will again be denoted by $\{g_n(t)\}$,
converging uniformly in every closed subregion of $|t|<1$ to a function $G(t)$
analytic in $|t|<1$. On account of (16.5) $G(t)$ has itself a representation of the
form

(16.6) $G(t) = w_0 + c\prod_{j=1}^{k}\frac{t-t_j}{1-\bar t_j\,t}, \qquad k\ge 1.$

Just as in (16.5) some of the $t_j$ here may have the absolute value 1.

Now consider that largest region $R_0$ in $|z|<1$ which contains the point $z_0$
and in which $|f(z)-w_0|<a$. According to the maximum modulus principle
$R_0$ is simply connected and we may map it on the circle $|t|<1$ by means
of a function $z=\varphi_0(t)$ so that $\varphi_0(0)=z_0$. On that part of the boundary $B_0$
of $R_0$ which lies interior to the circle $|z|<1$, if it exists, we have $|f(z)-w_0|=a$.
We shall now show that $R_0$ is the kernel of the sequence of regions $\{R_n\}$(50).
Indeed, consider any region $R_0'$ which together with its boundary lies interior
to $R_0$ and contains the point $z_0$. By the definition of $R_0$, in the region $R_0'$ and
on its boundary we have $|f(z)-w_0|<a$. Since the functions $f_n(z)-w_n$
converge uniformly to $f(z)-w_0$ in the closure of $R_0'$, for sufficiently large $n$ we
have $|f_n(z)-w_n|<a$ in the closure of $R_0'$, and therefore $R_0'$ belongs to all $R_n$
for sufficiently large values of $n$. Next, choose any point $z'$ of the circle
$|z|<1$ exterior to $R_0$ (if such a point exists). Connect the point $z'$ with the
point $z_0$ by any Jordan arc $L$ which lies wholly in the circle $|z|<1$. Since $z'$ is
exterior to $R_0$, there must exist on the arc $L$ at least one point $Z$ at which
$|f(Z)-w_0|>a$. For sufficiently large values of $n$ we must have $|f_n(Z)-w_n|>a$,
and consequently $Z$ is exterior to $R_n$. Thus, on any Jordan arc joining the
points $z_0$ and $z'$ there exists a point exterior to $R_n$ for all sufficiently large
values of $n$. Consequently $R_0$ is the kernel of the sequence of regions $\{R_n\}$.
Hence, by a well known theorem of Carathéodory(51) the sequence of
functions $\varphi_n(t)$ converges uniformly in every closed subregion of $|t|<1$ to the
function $\varphi_0(t)$, provided merely we have chosen $\varphi_n'(0)>0$, $\varphi_0'(0)>0$.

(50) For the notion of kernel of a sequence of domains cf. C. Carathéodory, Conformal
Representation, Cambridge Tract in Mathematics and Mathematical Physics, no. 28, (1932),
pp. 74-77.

(51) C. Carathéodory, loc. cit., particularly p. 76.

If we form the function $g_0(t)=f(\varphi_0(t))$, it follows that the sequence of
functions $\{g_n(t)\}$ converges uniformly in every closed subregion of $|t|<1$ to the
function $g_0(t)$. We have shown earlier, however, that the sequence $\{g_n(t)\}$
converges to the function $G(t)$ whose representation is given in (16.6). We thus
find that $g_0(t)=G(t)$ identically in $|t|<1$. From (16.6) it follows therefore that
$g_0(t)$ is analytic in $|t|\le 1$, is q'-valent ($q'\le p$) in $|t|<1$, and on the
circumference $|t|=1$ satisfies the relation $|g_0(t)-w_0|=a$.

Consider now an arbitrary positive number $\varepsilon$ such that $a-\varepsilon>D_p(w_0)$.
Denote by $R_\varepsilon$ the largest region in $|t|<1$ which contains the origin and
throughout which $|g_0(t)-w_0|<a-\varepsilon$. The boundary $C_\varepsilon$ of this region lies wholly
interior to $|t|<1$ and in $R_\varepsilon$ the function $g_0(t)$ is q''-valent ($q''\le p$). The
function $z=\varphi_0(t)$ maps the region $R_\varepsilon$ on a region $P_\varepsilon$ in the z-plane which is, together
with its boundary $\Gamma_\varepsilon$, interior to $R_0$. In $P_\varepsilon$ we have $|f(z)-w_0|<a-\varepsilon$ and on $\Gamma_\varepsilon$
we have $|f(z)-w_0|=a-\varepsilon$. Since the region $P_\varepsilon$ contains the point $z_0$ and since
$f(z)$ is q''-valent in $P_\varepsilon$, it follows that $a-\varepsilon\le D_p(w_0)$. This contradicts our
assumption concerning $\varepsilon$.

Since the assumption (16.4) leads to a contradiction, the relation (16.3) 
is true. 

We are now ready to prove the theorem. Lemmas 1 and 2 together yield
the inequalities

$\limsup_{n\to\infty} D_p(w_n) \le D_p(w_0) \le \liminf_{n\to\infty} D_p(w_n).$

Since, however, we always have $\liminf_{n\to\infty} D_p(w_n)\le\limsup_{n\to\infty} D_p(w_n)$, it
follows that $\limsup_{n\to\infty} D_p(w_n)=\liminf_{n\to\infty} D_p(w_n)=\lim_{n\to\infty} D_p(w_n)=D_p(w_0)$,
which proves the theorem.

17. $\lim_{n\to\infty} D_p(w_n)=0$ is a necessary and sufficient condition for
$\lim_{n\to\infty}|f^{(k)}(z_n)|(1-|z_n|)^k=0$ $(k=1, 2, \dots, p)$. An immediate consequence
of Theorem 18 is the following extension of Theorem 2, Chapter II, to the
higher derivatives of bounded functions.


THEOREM 19. Let $f(z)$ be regular and bounded in $|z|<1$:

$|f(z)|\le M,$

let $\{z_n\}$ be any sequence of points in $|z|<1$, and let $w_n=f(z_n)$. Then, a necessary
and sufficient condition for

$\lim_{n\to\infty}|f^{(k)}(z_n)|(1-|z_n|)^k = 0, \qquad k = 1, 2, \dots, p,$

is that $\lim_{n\to\infty} D_p(w_n)=0$.


We first prove the sufficiency of the condition. We assume that
$\lim_{n\to\infty} D_p(w_n)=0$. In accordance with the definition of the radius of
p-valence it follows that

(17.1) $\lim_{n\to\infty} D_k(w_n) = 0, \qquad k = 1, 2, \dots, p.$

By virtue of Theorem 2, Chapter II, the condition is sufficient for $p=1$. Let
us assume that the condition is sufficient for $p-1$ and prove it to be
sufficient for $p$. We assume therefore that (17.1) implies

(17.2) $\lim_{n\to\infty}|f^{(k)}(z_n)|(1-|z_n|)^k = 0, \qquad k = 1, 2, \dots, p-1.$

If the condition is not sufficient for $p$, we could find a positive constant $\delta$ and a
subsequence of $\{z_n\}$, which for simplicity will again be denoted by $\{z_n\}$, for
which

(17.3) $|f^{(p)}(z_n)|(1-|z_n|)^p \ge \delta$

and at the same time the relations (17.1) and (17.2) hold.
Now if we introduce the sequence of functions 


= 


which are bounded and regular in $|\zeta|<1$: $|\varphi_n(\zeta)|\le M$, we obtain by virtue of
the expression (2.3)


p! (p — »)! 
The relations (17.2) and (17.3) imply

(17.4) $\liminf_{n\to\infty}|\varphi_n^{(p)}(0)| \ge \delta > 0,$

while the relation (2.3) written out for $n=1, 2, \dots, p-1$ together with
(17.2) shows that

(17.5) $\lim_{n\to\infty}|\varphi_n^{(k)}(0)| = 0, \qquad \text{for } k = 1, 2, \dots, p-1.$


The sequence of functions $\{\varphi_n(\zeta)\}$ forms a normal family in $|\zeta|<1$. We may,
therefore, extract a convergent subsequence which for simplicity will again
be denoted by $\{\varphi_n(\zeta)\}$:

$\lim_{n\to\infty}\varphi_n(\zeta) = \varphi(\zeta).$

The relations (17.4) and (17.5) imply

(17.6) $\varphi^{(k)}(0)=0$ for $k=1, 2, \dots, p-1$; $\quad |\varphi^{(p)}(0)|\ge\delta.$

The equations in (17.6), however, imply that the radius of p-valence $D_p[\varphi(0)]$
of the Riemann surface on which $\varphi(\zeta)$ maps the circle $|\zeta|<1$ is positive at
the point $\varphi(0)$ of the surface: $D_p[\varphi(0)]>0$. According to Theorem 18, if we
denote by $D_p[\varphi_n(0)]$ the radius of p-valence at the point $\varphi_n(0)$ of the Riemann
surface $\mathcal{R}_n$ on which $\varphi_n(\zeta)$ maps the circle $|\zeta|<1$ and observe that $\varphi_n(0)=w_n$,
we obtain

$\lim_{n\to\infty} D_p(w_n) = D_p[\varphi(0)] > 0.$

But $\mathcal{R}_n$ is precisely the Riemann surface $R$ on which $f(z)$ maps the circle
$|z|<1$. Hence, the last relation contradicts (17.1) for $k=p$. This proves that
the assumption (17.3) is false and the sufficiency of our condition is
established.
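
The link between the derivatives of $f$ at $z_n$ and those of the auxiliary functions at the origin can be checked numerically in the simplest case. The Python sketch below rests on an assumption: the display defining the auxiliary functions is not legible in this copy, and the sketch takes them to be the customary compositions $f\big((\zeta+z_n)/(1+\bar z_n\zeta)\big)$. The test function $f(z)=z^2$, the point `Z_N`, and the helper names are likewise illustrative choices. Under that assumption the value at the origin is $f(z_n)$ and the first derivative there is $f'(z_n)(1-|z_n|^2)$.

```python
import cmath
import math


def phi(zeta, f, z_n):
    """Assumed auxiliary function: f composed with a self-map of the unit disc."""
    return f((zeta + z_n) / (1 + z_n.conjugate() * zeta))


def derivative_at_zero(g, order, radius=0.5, samples=256):
    """g^(order)(0) from Cauchy's integral formula on |z| = radius.

    With z = radius * exp(i*theta) the formula becomes the mean of
    g(z) / z**order over the circle, times order!; the trapezoid rule
    is used for the mean.
    """
    total = 0j
    for j in range(samples):
        z = radius * cmath.exp(2j * math.pi * j / samples)
        total += g(z) / z ** order
    return math.factorial(order) * total / samples


def f(z):
    return z * z          # illustrative function, bounded by 1 in |z| < 1


Z_N = 0.6 + 0.3j          # illustrative point of the unit disc

numeric = derivative_at_zero(lambda zeta: phi(zeta, f, Z_N), 1)
closed_form = 2 * Z_N * (1 - abs(Z_N) ** 2)   # f'(z_n) (1 - |z_n|^2)
print(abs(numeric - closed_form))
```

The printed difference should be at the level of rounding error, which is the case $k=1$ of the relation between the two families of derivatives.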

We now turn to the proof of the necessity of the condition in Theorem 19.
Let us assume that

$\lim_{n\to\infty}|f^{(k)}(z_n)|(1-|z_n|)^k = 0 \qquad \text{for } k = 1, 2, \dots, p.$

Forming again the functions $\varphi_n(\zeta)$, we see that

$\lim_{n\to\infty}|\varphi_n^{(k)}(0)| = 0 \qquad \text{for } k = 1, 2, \dots, p.$

Let us assume that we have already selected a uniformly convergent
subsequence of the $\{\varphi_n(\zeta)\}$, which, because of the normality of the family, is
always possible. The limit function $\varphi(\zeta)$ of the sequence has the property that
$\varphi^{(k)}(0)=0$ for $k=1, 2, \dots, p$. Consequently, $D_p[\varphi(0)]=0$ and by Theorem 18

$\lim_{n\to\infty} D_p(w_n) = 0.$

The last relation has been proved only for a subsequence of the original
sequence. But since from every sequence we may select a subsequence with this
property, it must also hold for the whole sequence. Theorem 19 is now
established.

It will be noticed that Theorem 19 is unsatisfactory in that no indication
is given of the manner in which expressions of the type $|f^{(k)}(z_n)|(1-|z_n|)^k$
depend on the radii of p-valence $D_p(w_n)$. In the case $p=1$ we have already
given inequalities which bring out this dependence (Theorem 3, Chapter II).
Our next task will be to extend Theorem 3, Chapter II, to the higher
derivatives of bounded functions. The constants that we shall obtain will, however,
not be precise. We shall first study upper bounds for the derivatives of
bounded functions. The inequalities that we shall obtain will, of course, yield
a new proof of Theorem 19 by quantitative methods rather than the purely
qualitative methods that we used in the present proof.


CHAPTER III. BOUNDED FUNCTIONS; INEQUALITIES ON $D_p$


18. A preliminary lower bound for $D_p$. For our purpose in the use of
$D_p(w_0)$ for the study of such relations as $|f^{(p)}(z_n)|(1-|z_n|)^p\to 0$, it is
desirable to have explicit numerical inequalities connecting $D_p(w_0)$ and the
derivatives $f'(z_0), f''(z_0), \dots, f^{(p)}(z_0)$. We first prove regarding this relationship


THEOREM 1. Suppose the function $f(z)$ analytic for $|z|<1$ with $f(0)=0$,
$f^{(p)}(0)=p!$, and with $|f(z)|\le M$ for $|z|<1$. Then we have

(18.1) $D_p(0) \ge M_p > 0,$

where $M_p=M_p(M)$ is a suitably chosen constant depending on $M$ and $p$ but not
on $f(z)$.


Our proof of Theorem 1 is a direct generalization of Landau's proof(52) for
the case $p=1$. For the case $p=1$, Landau's method yields the inequality

(18.2) $D_1(0) \ge 1/(6M),$

a special case of inequality (18.9) to be proved below. But other related
methods(53) yield the inequality

(18.3) $D_1(0) \ge 1/(4M),$

which is somewhat sharper than (18.2) and which we shall therefore take as
point of departure.

We remark that if $f(z)$ is analytic for $|z|<1$ with $f(0)=0$, $f'(0)=m\ne 0$,
with $|f(z)|\le M$ for $|z|<1$, then the function $f(z)/m$ has the derivative unity
at the origin and modulus in $|z|<1$ not greater than $M/|m|$.
Consequently under the transformation $w=f(z)/m$ we have from (18.3) the result
$D_1(0)\ge |m|/(4M)$, and under the transformation $w=f(z)$ we have

(18.4) $D_1(0) \ge \frac{|m|^2}{4M}.$

Let us now suppose Theorem 1 established with $p$ replaced by $j$ for
$j=1, 2, \dots, p-1$; we proceed to prove by induction the theorem as stated.
The cases

(18.5') $|f'(0)| \ge \frac{1}{(12M)^{p-1}}, \quad \frac{|f''(0)|}{2!} \ge \frac{1}{(12M)^{p-2}}, \quad \dots, \quad \frac{|f^{(p-1)}(0)|}{(p-1)!} \ge \frac{1}{12M}$

are all handled in a manner similar to the proof of (18.4). Thus in case


(52) E. Landau, Sitzungsberichte der Königlichen Preussischen Akademie der
Wissenschaften, Berlin, 1926, pp. 467-474.

(53) J. Dieudonné, Annales de l'École Normale Supérieure, vol. 48 (1931), pp. 247-358. Or
see Montel, Fonctions Univalentes, §37.




(18.5'), $j=1, 2, \dots, p-1$, the function

(18.6) $\frac{f(z)}{f^{(j)}(0)/j!}$

has the jth derivative $j!$ at the origin, and modulus in $|z|<1$ not greater than
$j!\,M/|f^{(j)}(0)|$, so by our assumption that Theorem 1 with $p$ replaced by $j$ is
established, we have under the transformation $w=j!\,f(z)/f^{(j)}(0)$,
$D_j(0)\ge M_j\big(j!\,M/|f^{(j)}(0)|\big)$, and we have under the transformation $w=f(z)$

(18.7) $D_j(0) \ge \frac{|f^{(j)}(0)|}{j!}\,M_j\!\left(\frac{j!\,M}{|f^{(j)}(0)|}\right);$

hence by the relation $D_p(0)\ge D_j(0)$ the theorem may be considered to be
proved. It remains to study the case that we have simultaneously

(18.5'') $\frac{|f^{(j)}(0)|}{j!} < \frac{1}{(12M)^{p-j}}, \qquad j = 1, 2, \dots, p-1,$

with, of course, the relation $f^{(p)}(0)=p!$.
Suppose r can be chosen (0 < r < 1) so that the expression 

(18.8) $R = r^p - \max_{|z|=r} |f(z) - z^p|$ 

is positive. Then we have $R \le r^p < r$, and for |w| < R the inequality 

$\left| \frac{f(z) - z^p}{z^p - w} \right| < 1$ 

holds on the circle |z| = r. Of course, $z^p - w$ cannot vanish on |z| = r. Then 
by Rouché's theorem the function f(z) - w has precisely as many zeros in 
|z| < r as does the function $z^p - w$, namely p. Then the transformation w = f(z) 
maps |z| < r onto a Riemann configuration which contains the region |w| < R 
with each point covered precisely p times. Thus (Theorem 15, Chapter II), 
we have $D_p(0) \ge R$, whether $D_p(0)$ refers to the Riemann configuration which 
is the image of |z| < r or to the configuration which is the image of |z| < 1 
under the transformation w = f(z). 
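
As an illustrative aside (not part of the original argument), the zero count furnished by Rouché's theorem can be checked numerically. The sketch below is a minimal Python example under stated assumptions: the test function f(z) = z^2 + z^3/10 (so p = 2 and f''(0) = 2!), the radius r = 1/2 and the sample values of w are all hypothetical choices, and the helper name `count_zeros` is ours. It evaluates the argument-principle integral by the trapezoidal rule.

```python
import cmath
import math


def count_zeros(g, dg, r, samples=4096):
    """Count zeros of g inside |z| < r with the argument principle.

    Approximates (1 / (2*pi*i)) * integral of g'(z)/g(z) dz over |z| = r
    by the trapezoidal rule; g must not vanish on the circle itself.
    """
    total = 0j
    step = 2 * math.pi / samples
    for k in range(samples):
        z = r * cmath.exp(1j * step * k)
        total += dg(z) / g(z) * (1j * z * step)  # dz = i z dtheta
    return round((total / (2j * math.pi)).real)


# Hypothetical test function with p = 2 and f''(0) = 2!.
p, r = 2, 0.5
f = lambda z: z**2 + 0.1 * z**3
df = lambda z: 2 * z + 0.3 * z**2

# R = r^p - max |f(z) - z^p| on |z| = r; here |f(z) - z^2| = 0.1 * r^3 exactly.
R = r**p - 0.1 * r**3

w = 0.2  # an arbitrary value with |w| < R
zeros = count_zeros(lambda z: f(z) - w, df, r)
print(R, zeros)
```

For every w with |w| < R the count agrees with that of z^p - w, namely p, which is the covering property used to conclude that D_p(0) is at least R.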

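It remains to show that r can be chosen in such a way that R as defined 
by (18.8) is positive. If we set $f(z) = \sum_{n=1}^{\infty} a_n z^n$, Cauchy's inequality is 
$|a_n| \le M$, and in particular $a_p = 1 \le M$. Consequently we may write on |z| = r 
by the use of (18.5^{(p)}) 

$|f(z) - z^p| \le \sum_{n=1}^{p-1} \frac{r^n}{(12M)^{p-n}} + \sum_{n=p+1}^{\infty} M r^n = \frac{1}{(12M)^p}\,\frac{12Mr - (12Mr)^p}{1 - 12Mr} + \frac{M r^{p+1}}{1 - r}$, 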


1942] FUNCTIONS ANALYTIC IN THE UNIT CIRCLE 


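and with the choice r = 1/(4M), for which 12Mr = 3 and (since $M \ge 1$) $1 - r \ge 3/4$, 

$R = r^p - \max_{|z|=r} |f(z) - z^p| \ge r^p - \frac{M r^{p+1}}{1 - r} - \frac{1}{(12M)^p}\,\frac{12Mr - (12Mr)^p}{1 - 12Mr}$ 

$\ge \frac{3^p - 3^{p-1} - (3^p - 3)/2}{(12M)^p}$, 

that is, 

(18.9) $R \ge \frac{1 + 3^{p-2}}{2 \cdot 3^{p-1} \cdot 4^p \cdot M^p}$. 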


We have now proved the desired inequality R > 0 and thus completed the 
proof of Theorem 1, and we also have material for obtaining an explicit 
inequality for $M_p(M)$ in inequality (18.1). 

19. Numerical lower bounds for $D_p$. When p = 2, relation (18.7) [or 
(18.4)] becomes in case (18.5') 

$D_1(0) \ge \frac{1}{576 M^3}$, 

whereas in case (18.5'') we have from (18.9) 

$D_2(0) \ge \frac{2}{96 M^2}$, 

so in either case we may write 

(19.1) $D_2(0) \ge \frac{1}{576 M^3}$. 

Inequality (19.1) is to be generalized by proving 

(19.2) $D_p(0) \ge M_p(M) = \frac{1}{4 \cdot 12^{2^p - 2} \cdot M^{2^p - 1}}$. 

We remark that $M_p(M)$, as thus defined, decreases monotonically as M 
increases. It is to be noticed that (19.2) holds for p = 1, by inequality (18.3), 
and for p = 2, by inequality (19.1); we assume (19.2) to hold with p replaced 
by j for j = 2, 3, ..., p-1, and shall establish (19.2) as written. In case 
(18.5^{(j)}) we find from (18.7) and (19.2) the inequality (p > 2) 

(19.3) $D_j(0) \ge \frac{1}{(12M)^{(p-j)2^j} \cdot 4 \cdot 12^{2^j - 2} \cdot M^{2^j - 1}}$. 

Direct comparison of the right-hand members of (19.2) and (19.3) now shows, 




by virtue of the inequality $2^{q-1} \ge q$, q a positive integer, and by virtue of 
$D_p(0) \ge D_j(0)$, that (19.2) holds in each of the cases (18.5^{(j)}), j = 1, 2, ..., p-1. 
Also in case (18.5^{(p)}) inequality (19.2) is valid, as we find from (18.9), so we 
have established 


COROLLARY 1. Under the hypothesis of Theorem 1, we have inequality (19.2). 


Needless to say, the numerical results contained in some of the preceding 
inequalities can be improved, and it is to be supposed that those contained 
in inequality (19.2) can be greatly improved. 
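
The numerical constants of this section lend themselves to a short exact check. The following Python sketch is offered only as an illustration under stated assumptions: it takes the constant of (19.2) to be 1/(4 · 12^(2^p - 2) · M^(2^p - 1)), which is our reading of the damaged display (it reproduces 1/(4M) for p = 1 and 1/(576 M^3) for p = 2), and it takes the bound in case (18.5^(j)) to be the product form obtained by inserting |f^(j)(0)|/j! >= 1/(12M)^(p-j) into (18.7). The function names are hypothetical.

```python
from fractions import Fraction


def lower_bound(p, M):
    """Assumed form of the constant in (19.2): 1 / (4 * 12^(2^p - 2) * M^(2^p - 1))."""
    M = Fraction(M)
    return 1 / (4 * Fraction(12) ** (2**p - 2) * M ** (2**p - 1))


def case_bound(p, j, M):
    """Assumed form of the right-hand member of (19.3) for case (18.5^(j))."""
    M = Fraction(M)
    return 1 / (
        (12 * M) ** ((p - j) * 2**j)
        * 4
        * Fraction(12) ** (2**j - 2)
        * M ** (2**j - 1)
    )


# Compare the two bounds for a few values of p, j and M >= 1.
for p in range(2, 6):
    for j in range(1, p):
        for M in (1, Fraction(3, 2), 5):
            assert case_bound(p, j, M) >= lower_bound(p, M)

print(lower_bound(1, 1), lower_bound(2, 1))
```

The comparison succeeds because (p - j + 1) 2^j does not exceed 2^p, which is the inequality 2^(q-1) >= q with q = p - j + 1.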

Inequality (18.7) is valid under the assumption $f^{(j)}(0) \ne 0$ instead of 
$f^{(j)}(0) = j!$, so by using $M_p(M)$ as defined by (19.2) we may formulate: 


COROLLARY 2. Suppose the functions $f_k(z)$ analytic for |z| < 1, with $f_k(0) = 0$ 
and $|f_k(z)| < M$ for |z| < 1. If as k becomes infinite the corresponding sequence 
$D_p(0)$ approaches zero, then we have also 

$\lim_{k \to \infty} f_k^{(p)}(0) = 0$. 


Under the conditions of Corollary 2 we have $D_p(0) \ge D_{p-1}(0) \ge \cdots \ge D_1(0)$, 
from which follows for j = 1, 2, ..., p the relation 

(19.4) $\lim_{k \to \infty} f_k^{(j)}(0) = 0$. 


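A specific inequality for the direct proof of (19.4) is useful. A consequence 
of (19.2) and (18.7) for j = 1, 2, ..., p, with the omission of the requirement 
$f^{(p)}(0) = p!$, is 

$D_j(0) \ge \frac{|f^{(j)}(0)|}{j!} \cdot \frac{1}{4 \cdot 12^{2^j - 2}\,(j!M/|f^{(j)}(0)|)^{2^j - 1}}$. 

The inequality $D_j(0) \le M$ is obvious, so we have 

$\frac{|f^{(j)}(0)|}{j!} < 24M \left[\frac{D_j(0)}{M}\right]^{2^{-j}} \le 24M \left[\frac{D_j(0)}{M}\right]^{2^{-p}}$. 

By virtue of the inequalities $D_j(0) \ge D_{j-1}(0)$ we may now write(54) 

(19.5) $|f'(0)| + \frac{1}{2!}|f''(0)| + \cdots + \frac{1}{p!}|f^{(p)}(0)| < 24p(1 + M)[D_p(0)]^{2^{-p}}$. 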


(54) For $0 < a \le 1$, M > 0, we have $M^a \le 1 + M$. 




We state explicitly a major result: 


COROLLARY 3. If f(z) is analytic and in modulus not greater than M for 
|z| < 1, with f(0) = 0, then inequality (19.5) is valid for every positive integer p. 


For the purpose of Corollary 3, the factor 1 + M in the right-hand member 
of (19.5) may of course be replaced by $M^{1 - 2^{-p}}$. 

20. A lower bound for the derivative of a circular product. The converse 
of Corollary 2 is false, as is illustrated by the sequence $f_k(z) = z$, with p = 2. 
The second derivative $f_k''(0)$ vanishes for every k, yet $D_2(0)$ has the constant 
value unity, so the relation $D_2(0) \to 0$ is not satisfied. Indeed, in the general 
situation that $f_k(z)$ is analytic for |z| < 1 with $f_k(0) = 0$ and $|f_k(z)| \le M$ for 
|z| < 1, it is not to be expected that $f_k^{(p)}(0) \to 0$ should imply $D_p(0) \to 0$, for 
the latter relation by virtue of $D_p(0) \ge D_l(0)$ implies also $D_l(0) \to 0$, 
l = 1, 2, ..., p-1, which by Corollary 2 implies (19.4), a relation which 
is not implied by the hypothesis and is indeed completely independent of 
the hypothesis. We should expect, then, that a relation in the opposite sense 
to Corollary 2 would necessarily involve the lower derivatives. We shall 
proceed to prove 


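THEOREM 2. Let the function w = f(z) analytic for |z| < 1 map |z| < 1 onto a 
Riemann configuration with f(0) = 0. Then there exists a positive constant $\gamma_p$, 
depending on p but not on f(z), such that we have 

(20.1) $D_p(0) \le \frac{1}{\gamma_p}\left[\,|f'(0)| + \frac{1}{2!}|f''(0)| + \cdots + \frac{1}{p!}|f^{(p)}(0)|\,\right]$. 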


The proof of Theorem 2 is to be carried out in several steps, of which the 
first is 


THEOREM 3. Let w = g(z) analytic for |z| < 1 map |z| < 1 onto |w| < 1 
counted precisely p times, or precisely m < p times, with g(0) = 0. Then we have 

(20.2) $|g'(0)| + \frac{1}{2!}|g''(0)| + \cdots + \frac{1}{p!}|g^{(p)}(0)| \ge c_p$, 

where $c_p$ is a suitably chosen number depending on p but not on g(z). 
To be explicit, we prove (20.2) with $c_p = 2^{-(p+1)!}$. 
The most general function g(z) is of the form(55) 

(20.3) $w = g_p(z) = z \prod_{i=1}^{p-1} \frac{z - \beta_i}{1 - \bar\beta_i z}$, $|\beta_i| < 1$, 

except for a constant factor of modulus unity which does not affect the 
left-hand member of (20.2) and which we therefore suppress. In the case p = 1, the 


(55) T. Radó, ibid. 




form (20.3) breaks down, but we have $g_1(z) = z$, and (20.2) is fulfilled with $c_1$ 
replaced by unity, which is greater than $c_1 = 1/4$. Henceforth, we suppose 
$p \ge 2$. 

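We prove (20.2) by induction, assuming the validity of (20.2) with p 
replaced by p-1 and proving (20.2) as written(56). Equation (20.3) can be 
expressed in the equivalent form 

(20.4) $g_p(z) = g_{p-1}(z)\,\frac{z - \alpha}{1 - \bar\alpha z}$, $|\alpha| \le 1$, 

where we have also 

$g_{p-1}(z) = a_1 z + a_2 z^2 + \cdots$, $|g_{p-1}(z)| < 1$, 
$g_p(z) = b_1 z + b_2 z^2 + \cdots$, $|g_p(z)| < 1$. 

The power series expansions of the second factor in the right-hand member of 
(20.4) and of its reciprocal yield by direct comparison of coefficients the two 
sets of equations 

(20.5) 
$b_1 = -a_1\alpha$, 
$b_2 = a_1(1 - \alpha\bar\alpha) - a_2\alpha$, 
$b_3 = a_1\bar\alpha(1 - \alpha\bar\alpha) + a_2(1 - \alpha\bar\alpha) - a_3\alpha$, 
... 
$b_k = a_1\bar\alpha^{k-2}(1 - \alpha\bar\alpha) + a_2\bar\alpha^{k-3}(1 - \alpha\bar\alpha) + \cdots + a_{k-1}(1 - \alpha\bar\alpha) - a_k\alpha$; 

(20.6) 
$a_1 = -\frac{b_1}{\alpha}$, 
$a_2 = -\frac{1 - \alpha\bar\alpha}{\alpha^2}\,b_1 - \frac{b_2}{\alpha}$, 
$a_3 = -\frac{1 - \alpha\bar\alpha}{\alpha^3}\,b_1 - \frac{1 - \alpha\bar\alpha}{\alpha^2}\,b_2 - \frac{b_3}{\alpha}$, 
... 
$a_k = -\frac{1 - \alpha\bar\alpha}{\alpha^k}\,b_1 - \frac{1 - \alpha\bar\alpha}{\alpha^{k-1}}\,b_2 - \cdots - \frac{1 - \alpha\bar\alpha}{\alpha^2}\,b_{k-1} - \frac{b_k}{\alpha}$, $\alpha \ne 0$. 


(56) The succeeding proof can be considerably shortened if no numerical estimate for $c_p$ is 
desired. The left-hand member of (20.2) is a continuous function of the numbers $\beta_i$ in the closed 
limited point set $|\beta_i| \le 1$, hence takes on a minimum value $c_p$; we must prove $c_p > 0$. By the 
hypothesis in the induction, the minimum value zero cannot be taken on when one or several 
numbers $\beta_i$ vanish, for then by (20.3) the left-hand member of (20.2) equals the corresponding 
sum with p replaced by some m < p for some function $g_m(z)$: $g_p(z) = z^{p-m} g_m(z)$. The minimum value 
zero cannot be taken on when all of the numbers $\beta_i$ are different from zero: $c_p \ge |g'(0)| = |\beta_1\beta_2 \cdots \beta_{p-1}| > 0$. 
Thus (20.2) is established. 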




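The following series of steps is a consequence of equations (20.5): 

$b_1 = -a_1\alpha$, $b_1 = -a_1\alpha$, 
$b_2 - b_1\bar\alpha = a_1 - a_2\alpha$, $b_2 - a_1 = b_1\bar\alpha - a_2\alpha$, 
$b_3 - b_2\bar\alpha = a_2 - a_3\alpha$, $b_3 - a_2 = b_2\bar\alpha - a_3\alpha$, 
... 
$b_p - b_{p-1}\bar\alpha = a_{p-1} - a_p\alpha$, $b_p - a_{p-1} = b_{p-1}\bar\alpha - a_p\alpha$, 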


| bs — a1] +| bs — +--- +] bp — 
S [| +] +] 
+ [| o2| + +--- +] a,| 
[| ar] +] +] — [| +] +--- +] 5,1] 
+ [| +] os) 


Cauchy's inequality for the function $g_{p-1}(z)$ informs us that $|a_p| \le 1$, so we 
may write 


| bi | +| de] +--- +] 
1—|a| | | 


(20.7) 


Case I. $|\alpha| \le c_{p-1}/2$. For $p \ge 2$ we have $|\alpha| \le 1/8$; so the last 
member of (20.7) is not less than 


Case II. $|\alpha| > c_{p-1}/2$. Here we replace each term of each of equations 
(20.6) by the corresponding absolute value. The resulting inequalities when 
added member for member with k = p-1 become (for abbreviation we write 
$a = |\alpha|$) 


1 1 1 1 1 


qr-2 


+) +--- +] 2 cps 


The coefficient of $|b_1|$ is here not less than the coefficients of $|b_2|$, $|b_3|$, ..., 
$|b_{p-1}|$; so we obtain at once from $a > c_{p-1}/2$ 


1—|a| | a| 
> 
5 




| bi] +| +--- + 


aP-! aP-? 


Cp~1\” 1 
-( 2 ) 


Theorem 3 is completely established. 

It is obvious that the choice $c_p = 2^{-(p+1)!}$ can be considerably improved 
by the present method alone. 

It is quite natural to divide the proof of Theorem 3 into two cases 
depending on the size of $|\alpha|$, comparing $b_j$ with $a_{j-1}$ when $|\alpha|$ is small and comparing 
$b_j$ with $a_j$ when $|\alpha|$ is large. For it follows from (20.4) that $b_j = a_{j-1}$ when 
$\alpha = 0$ and that $|b_j| = |a_j|$ when $|\alpha| = 1$. 
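
The coefficient relations (20.5) and the bound (20.2) can be illustrated numerically. The Python sketch below is a hedged illustration only: the random choice of the zeros beta_i, the degree p = 4 and all function names are our assumptions, and the constant tested is our reading c_p = 2^(-(p+1)!) of the damaged text. It builds the Taylor coefficients of the product (20.3) by series multiplication, checks the second equation of (20.5), and compares the coefficient sum with c_p.

```python
import cmath
import math
import random


def mul_series(u, v, n):
    """Multiply two power series (coefficient lists), keeping terms through z**n."""
    out = [0j] * (n + 1)
    for i, ui in enumerate(u[: n + 1]):
        for j, vj in enumerate(v[: n + 1 - i]):
            out[i + j] += ui * vj
    return out


def blaschke_factor(beta, n):
    """Taylor coefficients of (z - beta) / (1 - conj(beta) z) through z**n."""
    bb = beta.conjugate()
    coeffs = [-beta]
    for k in range(1, n + 1):
        coeffs.append((1 - abs(beta) ** 2) * bb ** (k - 1))
    return coeffs


def g_coeffs(betas, n):
    """Coefficients of g(z) = z * prod (z - beta_i) / (1 - conj(beta_i) z)."""
    series = [0j, 1 + 0j] + [0j] * (n - 1)
    for beta in betas:
        series = mul_series(series, blaschke_factor(beta, n), n)
    return series


random.seed(0)
p = 4
betas = [
    cmath.rect(random.uniform(0.1, 0.9), random.uniform(0, 2 * math.pi))
    for _ in range(p - 1)
]
a = g_coeffs(betas[:-1], p)  # coefficients a_k of g_{p-1}
b = g_coeffs(betas, p)       # coefficients b_k of g_p
alpha = betas[-1]

# Second equation of (20.5): b_2 = a_1 (1 - alpha conj(alpha)) - a_2 alpha.
residual = abs(b[2] - (a[1] * (1 - abs(alpha) ** 2) - alpha * a[2]))

total = sum(abs(c) for c in b[1 : p + 1])  # left-hand member of (20.2)
c_p = 2.0 ** (-math.factorial(p + 1))      # assumed explicit constant
print(residual, total >= c_p)
```

A random sample of this kind is of course no proof; it merely shows how far the explicit constant lies below typical values of the sum, in line with the remark that the constant can be considerably improved.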

21. Numerical upper bound for $D_p$. Theorem 3, of some interest in itself, 
is an important step in the proof of Theorem 2. Another preliminary 
proposition is 

THEOREM 4. Let the function w = f(z) analytic in |z| < 1 with f(0) = 0 map a 
smooth region R interior to |z| = 1 onto the unit circle |w| < 1 covered precisely p 
times or precisely m times, m < p. Then we have 

(21.1) $|f'(0)| + \frac{1}{2!}|f''(0)| + \cdots + \frac{1}{p!}|f^{(p)}(0)| \ge \gamma_p > 0$, 

where the number $\gamma_p$ depends on p but not on R or f(z). To be explicit, we shall 
establish (21.1) with $\gamma_p = 2^{-(p+1)!-p}$. 


We shall make use of the analyticity of f(z) only in R, not throughout the 
entire region |z| < 1. 

Denote by z = h(Z) a function which maps the region |Z| < 1 smoothly 
onto the region R of the z-plane, with h(0) = 0. Then the function w = g(Z) 
= f[h(Z)] maps the region |Z| < 1 onto the unit circle |w| < 1 covered 
precisely p times or precisely m < p times, with g(0) = 0, so g(Z) satisfies the 
hypothesis of Theorem 3. 

Let us introduce the notation 

$g(Z) = a_1 Z + a_2 Z^2 + \cdots$, 
$f(z) = b_1 z + b_2 z^2 + \cdots$, 
$h(Z) = d_1 Z + d_2 Z^2 + \cdots$. 

We note that Cauchy's inequality for the function h(Z) yields 

(21.2) $|d_k| \le 1$, k = 1, 2, ... 


> Cp-1 1 (=) 




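The coefficients of f(z) and g(Z) are related by equations that we now need 
to consider: 

(21.3) $g(Z) = f[h(Z)] = b_1[d_1 Z + d_2 Z^2 + d_3 Z^3 + \cdots] + b_2[d_1 Z + d_2 Z^2 + d_3 Z^3 + \cdots]^2 + b_3[d_1 Z + d_2 Z^2 + d_3 Z^3 + \cdots]^3 + \cdots$. 

By equating coefficients of corresponding powers of Z we obtain 

(21.4) 
$a_1 = b_1 d_1$, 
$a_2 = b_1 d_2 + b_2 d_1^2$, 
$a_3 = b_1 d_3 + 2 b_2 d_1 d_2 + b_3 d_1^3$, 
$a_4 = b_1 d_4 + b_2(d_2^2 + 2 d_1 d_3) + 3 b_3 d_1^2 d_2 + b_4 d_1^4$, 
$a_5 = b_1 d_5 + b_2(2 d_1 d_4 + 2 d_2 d_3) + b_3(3 d_1 d_2^2 + 3 d_1^2 d_3) + b_4(4 d_1^3 d_2) + b_5 d_1^5$, 
... 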


The law of the coefficients of the $b_j$ in equations (21.4) is relatively simple, 
and is readily formulated in terms of the subscripts of the numbers $a_j$ and $b_j$, 
and involves primarily the partitions of the subscripts of the numbers $a_j$. The 
precise law would be a needless refinement for our present relatively rough 
purposes. If we replace each $b_j$ by unity, it is obvious from (21.2) that the 
function g(Z) in (21.3) is dominated by 

$[Z + Z^2 + Z^3 + \cdots] + [Z + Z^2 + Z^3 + \cdots]^2 + [Z + Z^2 + Z^3 + \cdots]^3 + \cdots = Z + 2Z^2 + 4Z^3 + 8Z^4 + \cdots$. 

Then the sum of the absolute values of all the coefficients of all the numbers $b_j$ 
in the first p of equations (21.4) is not greater than $1 + 2 + 4 + \cdots + 2^{p-1}$, 
which is less than $2^p$. Insertion in each of equations (21.4) of absolute value 
signs on the numbers $a_j$, on the numbers $b_j$, and on the coefficients of the 
numbers $b_j$ yields a corresponding inequality. When the first p of these inequalities 
are added member for member, there results the inequality 

$|a_1| + |a_2| + \cdots + |a_p| \le 2^p\,[\,|b_1| + |b_2| + \cdots + |b_p|\,]$, 

so (21.1) with $\gamma_p = 2^{-(p+1)!-p}$ is a consequence of Theorem 3. 
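
The dominating series Z + 2Z^2 + 4Z^3 + ... can be verified by a direct computation with truncated polynomials. The following Python sketch is an illustration under our own naming (the helpers `mul_trunc` and `majorant_coefficients` are hypothetical); it expands the sum of the powers of Z + Z^2 + ... + Z^p through the term in Z^p, which suffices because the kth power begins with Z^k.

```python
def mul_trunc(u, v, n):
    """Product of two integer coefficient lists, truncated after Z**n."""
    out = [0] * (n + 1)
    for i, ui in enumerate(u):
        if ui == 0:
            continue
        for j, vj in enumerate(v):
            if i + j > n:
                break
            out[i + j] += ui * vj
    return out


def majorant_coefficients(p):
    """Coefficients through Z**p of sum over k = 1..p of (Z + Z**2 + ... + Z**p)**k."""
    base = [0] + [1] * p
    power = base[:]
    total = [0] * (p + 1)
    for _ in range(p):
        total = [t + c for t, c in zip(total, power)]
        power = mul_trunc(power, base, p)
    return total


coeffs = majorant_coefficients(6)
print(coeffs, sum(coeffs))
```

The coefficient of Z^n is 2^(n-1), so the first p coefficients sum to 2^p - 1, which is the count used in passing from Theorem 3 to (21.1).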
We are now in a position to prove Theorem 2; the trivial case $D_p(0) = 0$ 
needs no further discussion and is henceforth excluded. Under the hypothesis 




of Theorem 2 the function 

(21.5) $w_1(z) = \frac{f(z)}{D_p(0)}$ 

is analytic for |z| < 1 and maps a smooth region R interior to |z| = 1 onto 
the region |w| < 1 covered precisely p times or precisely m < p times, with 
$w_1(0) = 0$. Theorem 4 applied to the function (21.5) yields at once inequality 
(20.1). Theorem 2 is established, and we may state the 


COROLLARY. In Theorem 2 we may take $\gamma_p = 2^{-(p+1)!-p}$. 


The number $2^{-(p+1)!-p}$ can obviously be greatly improved, even without 
change of method. 

Theorem 2 is stated in the form convenient for applications, but we have 
used in the proof the analyticity of f(z) not in the entire region |z| < 1, only 
in a neighborhood of the origin. However, if f(z) is analytic in a region 
containing points for which $|z| \ge 1$, the number $D_p(0)$ is to be defined as referring 
to the Riemann configuration which is the image of |z| < 1 under the 
transformation w = f(z). Theorem 2 is false if the points $|z| \ge 1$ are not excluded, 
as is shown by the example p = 1, f(z) = z. 

It is clear now that from Theorem 1 (with Corollary 1) and Theorem 2 of 
the present chapter, Theorem 19 of Chapter II may be obtained in the explicit 
form of inequalities. Indeed, we have 


THEOREM 5. Let f(z) be regular in |z| < 1 and bounded there: 

|f(z)| < M. 

Let $\{z_n\}$ ($|z_n| < 1$) be a sequence of points in |z| < 1 and let $w_n = f(z_n)$. Then, 
there exist two constants $\lambda_p$ and $\Lambda_p$ of which $\lambda_p$ depends on p alone, while $\Lambda_p$ 
depends on p and M so that 

(21.6) $\lambda_p D_p(w_n) \le \sum_{k=1}^{p} \left| \sum_{\nu=0}^{k-1} (-1)^{\nu} \binom{k-1}{\nu} \bar z_n^{\,\nu} (1 - |z_n|^2)^{k-\nu} \frac{f^{(k-\nu)}(z_n)}{(k-\nu)!} \right| \le \Lambda_p [D_p(w_n)]^{2^{-p}}$, 

where $D_p(w_n)$ is the radius of p-valence at the point $w_n$ of the Riemann surface on 
which w = f(z) maps the circle |z| < 1. 


The writers are not informed as to whether the exponent $2^{-p}$ in (21.6) is 
the best possible one. Here, and in improving the constants $\lambda_p$ and $\Lambda_p$ already 
obtained, lie a number of interesting open problems. 

As a consequence of the second half of inequality (21.6) and the example 
of §9, Theorem 8 we may state 


THEOREM 6. Let the function Q(r) be defined and positive for 0 < r < 1, with 
$\lim_{r \to 1} Q(r) = 0$. Let the positive integer m be given. Then there exist a function 




w = f(z) analytic and univalent in |z| < 1, continuous in $|z| \le 1$, and a sequence 
of points $z_n$ with $|z_n| < 1$, $|z_n| \to 1$, such that we have 


Dn(Wn) 


where $w_n = f(z_n)$. 


CHAPTER IV. FUNCTIONS WHICH OMIT TWO VALUES 


22. Inequalities for $D_p(w_n)$ when $|f(z_n)|$ is bounded. Practically all the 
results of the Chapters II and III may be extended to the class of functions 
f(z) regular in the circle |z| < 1 which in that circle differ from 0 and 1(57). 
To be more specific, suppose that f(z) is regular in the circle |z| < 1 and that 
$f(z) \ne 0, 1$ in |z| < 1. Let $\{z_n\}$ ($|z_n| < 1$) be an arbitrary sequence of points in 
the circle so that $|f(z_n)|$ remains bounded for all n. Under these assumptions 
what is a necessary and sufficient condition that $(1 - |z_n|)^k f^{(k)}(z_n) \to 0$ 
(k = 1, 2, ..., p)? If we examine the proof of Theorem 19, Chapter II, we 
notice that absolutely no modification is necessary in order to extend this 
theorem to the case under consideration since we are again dealing with a 
normal family $\{\varphi_n(\zeta)\}$ which, due to the condition that $|f(z_n)|$ is bounded, 
does not contain the infinite constant. The proof of Theorem 19, therefore, 
may be repeated verbatim to yield 


THEOREM 1. Let f(z) be regular in |z| < 1 and $f(z) \ne 0, 1$ there. Let $\{z_n\}$ 
($|z_n| < 1$) be a sequence of points in |z| < 1 such that $|f(z_n)| < M$ for all n. Then, 
a necessary and sufficient condition that 

$\lim_{n \to \infty} |f^{(k)}(z_n)|\,(1 - |z_n|)^k = 0$, k = 1, 2, ..., p, 

for a fixed positive integer p is that 

$\lim_{n \to \infty} D_p(w_n) = 0$, 

where $w_n = f(z_n)$. 


Again as in the case of Theorem 19 it is desirable to give explicitly the 
relation between $|f^{(p)}(z_n)|(1 - |z_n|^2)^p$ and $D_p(w_n)$. In view of §21, Theorem 5 
and Schottky's theorem this relation is easily obtained. We use Schottky's 
theorem in the following form(58): If f(z) is regular in |z| < 1 and omits there 
the values zero and one, if $f(z) = a_0 + a_1 z + \cdots$, then there exists a positive 
constant A, independent of $a_0$, $\theta$, so that 

(22.1) $|f(z)| < [\,|a_0| + 2\,]^{A/(1-\theta)}$ 

in the circle $|z| \le \theta < 1$. 


(57) The case $f(z) \ne a, b$ may always be reduced to the above case by considering 
$\varphi(z) = (f(z) - a)/(b - a)$. 
(58) Cf. L. Bieberbach, Lehrbuch der Funktionentheorie, vol. 2, 2d edition, 1931, p. 224. 


Let us assume now that the hypotheses of Theorem 1 are satisfied, and 
form the functions 

(22.2) $\varphi_n(\zeta) = f\!\left(\frac{\zeta + z_n}{1 + \bar z_n \zeta}\right)$. 

These functions are all regular in $|\zeta| < 1$ and omit there the two values 0 
and 1. Furthermore, $\varphi_n(0) = f(z_n) = w_n$ are bounded in absolute value by the 
constant M: 

$|\varphi_n(0)| < M$. 

Applying Schottky's theorem in the form (22.1) to the functions $\varphi_n(\zeta)$, we 
find that 

$|\varphi_n(\zeta)| < [M + 2]^{A/(1-\theta)} = M_\theta$ 

in the circle $|\zeta| \le \theta < 1$. If we set now 

(22.3) $g_n(\zeta) = \varphi_n(\theta\zeta)$, 

we obtain a regular function $g_n(\zeta)$ in the circle $|\zeta| < 1$ which satisfies the 
inequality $|g_n(\zeta)| < M_\theta$ in the whole circle $|\zeta| < 1$. Finally we set 

(22.4) $h_n(\zeta) = g_n(\zeta) - g_n(0)$, 

so that $h_n(\zeta)$ is regular in $|\zeta| < 1$, $h_n(0) = 0$, and 

$|h_n(\zeta)| < 2M_\theta$. 

Now, according to §21, Theorem 5, we have 

$\lambda_p D_p(0) \le |h_n'(0)| + \frac{1}{2!}|h_n''(0)| + \cdots + \frac{1}{p!}|h_n^{(p)}(0)| \le \Lambda_p [D_p(0)]^{2^{-p}}$, 

where $D_p(0)$ is the radius of p-valence (§14) at the point w = 0 of the Riemann 
configuration $R_n$ on which $w = h_n(\zeta)$ maps the circle $|\zeta| < 1$. From (22.4) we 
obtain 

$\lambda_p D_p(0) \le |g_n'(0)| + \frac{1}{2!}|g_n''(0)| + \cdots + \frac{1}{p!}|g_n^{(p)}(0)| \le \Lambda_p [D_p(0)]^{2^{-p}}$. 

But $g_n(\zeta)$ maps $|\zeta| < 1$ on a Riemann configuration $R_n'$ obtained from $R_n$ by 
translating it along the vector $g_n(0)$. Therefore, $D_p(0)$ is equal to the radius 
of p-valence of $R_n'$ at the point $w = g_n(0)$. This radius we shall denote by 
$D_p[g_n(0)]$. We thus obtain 

$\lambda_p D_p[g_n(0)] \le |g_n'(0)| + \frac{1}{2!}|g_n''(0)| + \cdots + \frac{1}{p!}|g_n^{(p)}(0)| \le \Lambda_p [D_p[g_n(0)]]^{2^{-p}}$. 




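By virtue of (22.3) this becomes 

$\lambda_p D_p[g_n(0)] \le \theta|\varphi_n'(0)| + \frac{\theta^2}{2!}|\varphi_n''(0)| + \cdots + \frac{\theta^p}{p!}|\varphi_n^{(p)}(0)| \le \Lambda_p [D_p[g_n(0)]]^{2^{-p}}$. 

Now, the Riemann surface $R_n'$ can simply be considered as the surface on 
which the function $w = \varphi_n(\zeta)$ maps the circle $|\zeta| < \theta$. It is, therefore, merely a 
part of the surface R on which $w = \varphi_n(\zeta)$, and by (22.2) w = f(z), maps the circle 
|z| < 1. If we denote by $D_p(w_n)$ the radius of p-valence of R at the point 
$w = w_n$, we clearly must have 

$D_p[g_n(0)] \le D_p(w_n)$. 

We may, therefore, infer the inequality 

(22.5) $\theta|\varphi_n'(0)| + \frac{\theta^2}{2!}|\varphi_n''(0)| + \cdots + \frac{\theta^p}{p!}|\varphi_n^{(p)}(0)| \le \Lambda_p [D_p(w_n)]^{2^{-p}}$. 

Now, since $0 < \theta < 1$, we find 

$|\varphi_n'(0)| + \frac{1}{2!}|\varphi_n''(0)| + \cdots + \frac{1}{p!}|\varphi_n^{(p)}(0)| \le \frac{\Lambda_p}{\theta^p} [D_p(w_n)]^{2^{-p}}$. 

According to (22.2) and (2.3) we obtain 

(22.6) $\sum_{k=1}^{p} \left| \sum_{\nu=0}^{k-1} (-1)^{\nu} \binom{k-1}{\nu} \bar z_n^{\,\nu} (1 - |z_n|^2)^{k-\nu} \frac{f^{(k-\nu)}(z_n)}{(k-\nu)!} \right| \le \frac{\Lambda_p}{\theta^p} [D_p(w_n)]^{2^{-p}}$. 

This gives us the desired inequality from above. The corresponding inequality 
from below is contained in §20, Theorem 2: 

(22.7) $\lambda_p D_p(w_n) \le \sum_{k=1}^{p} \left| \sum_{\nu=0}^{k-1} (-1)^{\nu} \binom{k-1}{\nu} \bar z_n^{\,\nu} (1 - |z_n|^2)^{k-\nu} \frac{f^{(k-\nu)}(z_n)}{(k-\nu)!} \right|$. 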


We may, therefore, state the following 


THEOREM 2. Let f(z) be regular in |z| < 1 and $f(z) \ne 0, 1$ there. Let $\{z_n\}$ 
($|z_n| < 1$) be a sequence of points in |z| < 1 such that $|f(z_n)| < M$ for all n. Then, 
for any $0 < \theta < 1$ there exist two constants $\lambda_p$ and $\Lambda_p$ of which $\lambda_p$ depends on p 
alone, while $\Lambda_p$ depends on p, M, and $\theta$, so that 


(22.8) 




where $D_p(w_n)$ is the radius of p-valence at the point $w_n = f(z_n)$ of the Riemann 
surface on which w = f(z) maps the circle |z| < 1. 


Since from the form of $\Lambda_p$ it is evident that it tends to infinity as $\theta$ tends 
to 1, the best value for the right side of (22.8) is obtained for that value of $\theta$ 
for which $\Lambda_p/\theta^p$ attains its minimum. That value may be readily computed 
from the expression for $\Lambda_p$. It is evident also that Theorem 2 implies 
Theorem 1. 

We remark that under the conditions of Theorem 2 we have $D_p(w) \le |w|$, 
so that (22.8) gives an inequality on the approach to zero of $(1 - |z|^2)^k f^{(k)}(z)$ 
as w tends to zero, for every k. 

A further consequence of Theorem 2 is that under the hypothesis of that 
theorem, an additional inequality of the form |f(z)| < M implies inequalities 
$|f^{(k)}(z)|(1 - |z|^2)^k < M_k$, where $M_k$ depends only on k and M. Indeed, we have 
$D_p(w) \le M$; our conclusion(59) follows from (22.8). 

23. Counterexamples. In Theorems 1 and 2 an important part of the 
hypothesis was the fact that $|f(z_n)| < M$ for all n. Since any sequence $\{z_n\}$ can 
be decomposed into sequences on which $|f(z_n)|$ is bounded and those on which 
$|f(z_n)|$ tends to infinity, it is natural to inquire how far Theorem 2 can be 
extended to sequences $\{z_n\}$ for which $|f(z_n)| \to \infty$. 

That the conclusion of Theorem 2 as a proposition is false for such 
sequences is a theorem which we shall establish: 


THEOREM 3. There exists a function f(z) with two omitted values and regular 
in |z| < 1 and there exists a sequence of points $\{z_n\}$ ($|z_n| < 1$, $|z_n| \to 1$) such that, 
setting $w_n = f(z_n)$, we have $D_1(w_n) \to 0$, $|w_n| \to \infty$, and yet 
$\lim_{n \to \infty} |f'(z_n)|(1 - |z_n|^2) = 8\pi$. 

In the half-plane $\Re W > 0$, where W = u + iv, 

$\Re(W + e^{-W+1} + 3) = u + e^{-u+1}\cos v + 3 > 3 - e > 0$. 

Consequently, in $\Re W > 0$ the function $W + e^{-W+1} + 3$ omits all values in some 
neighborhood of the origin, as does the function 

(23.1) $w = f(z) = (W + e^{-W+1} + 3)^2 = F(W)$, 

where we set W = (1+z)/(1-z), so that z is a point of the unit circle |z| < 1. 
We choose $W_n = 1 + 2n\pi i + 1/n$, whence $e^{-W_n+1} = e^{-1/n}$, and find 

(23.2) $\frac{df(z)}{dW} = 2(W + e^{-W+1} + 3)(1 - e^{-W+1})$. 

Thus, f'(z) vanishes in the points where $1 - e^{-W+1} = 0$, namely $W = 1 + 2n\pi i$, 
n = 0, ±1, ±2, .... If we define $z_n$ by the relation $W_n = (1+z_n)/(1-z_n)$, we 
find from (23.2) 


(59) More precise inequalities of this type were developed by O. Szász, loc. cit. 




$\frac{df(z_n)}{dW} = 2\left(4 + \frac{1}{n} + e^{-1/n} + 2n\pi i\right)(1 - e^{-1/n})$, 

so that $df(z_n)/dW \to 4\pi i$. We next compute $|1 - z|^2 = 4/|W+1|^2$ and 
$|dW/dz| = |W+1|^2/2$. Hence, 

$\left|\frac{dW}{dz}\right|_{z=z_n} (1 - |z_n|^2) = 2 + \frac{2}{n} \to 2$. 

Thus, we obtain finally 

$|f'(z_n)|(1 - |z_n|^2) \to 8\pi$. 

It now remains to be shown that $D_1(w_n) \to 0$. This may be shown as follows. In 
the W-plane consider the two points $W = 1 + 2n\pi i$ and $W_n = 1 + 2n\pi i + 1/n$. 
Join these two points by a rectilinear segment, necessarily horizontal. This 
segment is mapped by the function w = F(W) on a certain arc lying on the 
corresponding Riemann configuration and joining the points $w = (5 + 2n\pi i)^2$ 
and $w_n = (4 + e^{-1/n} + 1/n + 2n\pi i)^2$, of which the first is a branch point of the 
Riemann configuration in question. It is clear, therefore, that this arc 
emanates from the center of the circle $|w - w_n| < D_1(w_n)$ and terminates in a point 
lying exterior to or on the boundary of that circle(60). Hence, the length of 
this arc cannot be less than $D_1(w_n)$. But the length can be estimated directly. 
Indeed, it is equal to 

$\int_1^{1+1/n} |F'(2n\pi i + u)|\,du$. 

From (23.2) we find 

$D_1(w_n) \le 2\int_1^{1+1/n} |2n\pi i + u + e^{-u+1} + 3|\,(1 - e^{-u+1})\,du$. 

Now, in the interval $1 \le u \le 1 + 1/n$, we have $e^{-u+1} \le 1$ and $1 - e^{-u+1} \le 1 - e^{-1/n}$, 
so that 

(23.3) $D_1(w_n) \le 2(1 - e^{-1/n})(2n\pi + 5 + 1/n) \cdot 1/n$. 

Hence, as $n \to \infty$, we have $D_1(w_n) \to 0$, which completes the proof of the 
theorem. 
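
The two limits in the preceding proof can be observed numerically. The Python sketch below is an illustration only; the function names are ours, and the formulas coded are (23.2), the relation |dW/dz| = |W+1|^2/2 and the right-hand member of (23.3) as read from the text. It evaluates |f'(z_n)|(1 - |z_n|^2) at W_n = 1 + 2nπi + 1/n and the upper bound for D_1(w_n).

```python
import cmath
import math


def F_prime(W):
    """dF/dW for F(W) = (W + e^(1 - W) + 3)^2, as in (23.2)."""
    return 2 * (W + cmath.exp(1 - W) + 3) * (1 - cmath.exp(1 - W))


def scaled_derivative(n):
    """|f'(z_n)| (1 - |z_n|^2) at W_n = 1 + 2 n pi i + 1/n, with z = (W - 1)/(W + 1)."""
    W = complex(1 + 1 / n, 2 * math.pi * n)
    z = (W - 1) / (W + 1)
    dW_dz = (W + 1) ** 2 / 2
    return abs(F_prime(W) * dW_dz) * (1 - abs(z) ** 2)


def arc_bound(n):
    """Right-hand member of (23.3), an upper bound for D_1(w_n)."""
    return 2 * (1 - math.exp(-1 / n)) * (2 * n * math.pi + 5 + 1 / n) / n


for n in (10, 100, 1000):
    print(n, scaled_derivative(n), arc_bound(n))
print(8 * math.pi)
```

The first column of values approaches 8π while the second decreases to zero, in agreement with the statement of Theorem 3. Very large n should be avoided in this naive form, since 1 - |z_n|^2 is then computed with heavy cancellation.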


In connection with the present example one may make two remarks. 
Remark 1. If one replaces the function f(z) in (23.1) by the function 


(23.4) f(z) = We=(1+2)/(1 — 2), 
(60) Study of the variation of arg (dw) on the arc shows that the arc lies wholly in the 
circle in question, and hence that $D_1(w_n) = |F(W) - F(W_n)|$, where $W = 1 + 2n\pi i$. A similar 
fact holds under Remarks 1 and 2. 




with $W_n = 1 + 2n\pi i + 1/n^2$, clearly the relation $D_1(w_n) \to 0$ still holds, while 
$|f'(z_n)|(1 - |z_n|^2) \to \infty$. Thus, $D_1(w_n) \to 0$ does not even imply the boundedness of 
$|f'(z_n)|(1 - |z_n|^2)$. 

Remark 2. Let $\alpha$ be any real number in the interval $0 < \alpha < 1$. Choose an 
integer k so that $k > \alpha/(1-\alpha)$. Then, the choice 

$f(z) = (W + e^{-W+1} + 3)^{k+1}$, W = (1+z)/(1-z), 

with $W_n = 1 + 2n\pi i + 1/n^k$ yields $D_1(w_n) \to 0$. Indeed, a computation analogous 
to the one in the preceding example shows that $D_1(w_n) = O(1/n^k)$. On the other 
hand, $|w_n| = O(n^{k+1})$. Hence $|w_n|^{\alpha} \cdot D_1(w_n) = O(n^{\alpha(k+1)-k})$ and this expression 
tends to zero. Furthermore, it is easily seen that $|f'(z_n)|(1 - |z_n|^2) > c > 0$, 
where c is a certain positive constant. Thus, for the class of functions with a 
region of omitted values no relation $|w_n|^{\alpha} \cdot D_1(w_n) \to 0$ with $0 < \alpha < 1$ can imply 
$|f'(z_n)|(1 - |z_n|^2) \to 0$. 
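
The arithmetic behind the choice of k in Remark 2 can be made explicit. The short Python sketch below is an illustration under our reading of the damaged text, namely that |w_n| = O(n^(k+1)) and D_1(w_n) = O(n^(-k)); the function names are hypothetical. It computes the smallest admissible integer k for a given α and the resulting exponent of n, using exact rational arithmetic.

```python
import math
from fractions import Fraction


def smallest_k(alpha):
    """Smallest integer k with k > alpha / (1 - alpha), for 0 < alpha < 1."""
    alpha = Fraction(alpha)
    return math.floor(alpha / (1 - alpha)) + 1


def decay_exponent(alpha, k):
    """Exponent e with |w_n|^alpha * D_1(w_n) = O(n^e), assuming the orders stated above."""
    return Fraction(alpha) * (k + 1) - k


for alpha in (Fraction(1, 2), Fraction(9, 10), Fraction(99, 100)):
    k = smallest_k(alpha)
    print(alpha, k, decay_exponent(alpha, k))
```

The exponent α(k+1) - k is negative exactly when k exceeds α/(1-α), which is why the required k grows without bound as α approaches 1.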

In the example of Theorem 3 and the examples in the two remarks it will 
be noticed that $|w_n| \cdot D_1(w_n)$ does not tend to zero. The case that $|w_n| \cdot D_1(w_n)$ 
tends to zero will not be treated in its full generality in this paper. A special 
case is considered in §25. The case $|w_n|^{\alpha} \cdot D_1(w_n) \to 0$ for $\alpha > 1$ will be 
considered in the next section. 

24. Case: $\lim_{n \to \infty} |w_n|^{1+\epsilon} D_1(w_n) = 0$. The following extension of 
Theorem 1 for p = 1 to the case $|w_n| \to \infty$ will now be proved: 


THEOREM 4. Let f(z) be analytic in |z| < 1 and omit two values there. Let 
$\{z_n\}$ ($|z_n| < 1$) be a sequence of points in |z| < 1 such that, setting $w_n = f(z_n)$, 
we have $\lim_{n \to \infty} |w_n| = \infty$. Then, the condition 

(24.1) $\lim_{n \to \infty} |w_n|^{1+\epsilon} D_1(w_n) = 0$ 

for any positive $\epsilon$ implies 

(24.2) $\lim_{n \to \infty} |f'(z_n)|(1 - |z_n|^2) = 0$. 


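It is clear that the sequence of functions 

$\varphi_n(\zeta) = f\!\left(\frac{\zeta + z_n}{1 + \bar z_n \zeta}\right)$ 

regular in $|\zeta| < 1$ is normal. Since by hypothesis $\lim_{n \to \infty} |w_n| = \infty$, we have 

$\lim_{n \to \infty} |\varphi_n(0)| = \infty$, 

so that 

(24.3) $\lim_{n \to \infty} |\varphi_n(\zeta)| = \infty$ 

uniformly in every closed subregion of $|\zeta| < 1$. 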




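Choose a positive number 

$\rho < \frac{\epsilon}{2 + \epsilon}$, 

whence 

$\frac{1 + \rho}{1 - \rho} < 1 + \epsilon$. 

It follows from (24.3) that for n sufficiently large the function $1/\varphi_n(\zeta)$ is 
regular in the circle $|\zeta| \le \rho_1$, where $\rho_1$ is any number such that $\rho < \rho_1 < 1$. 
Furthermore, n may be chosen so large that $1/|\varphi_n(\zeta)| < 1$ in $|\zeta| \le \rho_1$, which implies 
that $\log|\varphi_n(\zeta)|$ is harmonic and positive in $|\zeta| \le \rho_1$. Then, using Poisson's 
integral for the region $|\zeta| < \rho_1$, one sees immediately that 

(24.4) $\log|\varphi_n(\zeta)| \le \frac{\rho_1 + \rho}{\rho_1 - \rho}\,\log|\varphi_n(0)|$ 

in the circle $|\zeta| \le \rho < \rho_1$. Now, by taking $\rho_1$ so near to unity that 
$(\rho_1 + \rho)/(\rho_1 - \rho) < 1 + \epsilon$ and then by choosing n sufficiently large, the inequality (24.4) implies 
$|\varphi_n(\zeta)| < |\varphi_n(0)|^{1+\epsilon}$ in the circle $|\zeta| \le \rho$(61). 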

Now, according to Theorem 3 of Chapter II, 
| (0) |*r? 


where $M_r \ge |\varphi_n(\zeta)|$ for $|\zeta| \le r$. If we set $r = \rho$, $M_r = |\varphi_n(0)|^{1+\epsilon} = |w_n|^{1+\epsilon}$, we 
obtain for n sufficiently large 


D,(wn) 2 


8 | Wn 
from which the theorem follows at once. 
The treatment of the case for general p is quite analogous: 


THEOREM 5. Let f(z) be analytic in |z| < 1 and omit two values there. Let $\{z_n\}$ 
($|z_n| < 1$) be a sequence of points in |z| < 1 such that, setting $w_n = f(z_n)$, 
$\lim_{n \to \infty} |w_n| = \infty$. Then, the condition 
(24.6) lim | w,|(+?-D,(w,) = 0 


for any positive € implies 


(24.7) lim | f= 


(24.5) wn) = 


(61) The reasoning employed in the proof of this inequality is well known. Cf. A. Ostrowski, 
Abhandlungen des Mathematischen Seminars der Hamburgischen Universität, vol. 1 (1922), 
pp. 327-350; S. Mandelbrojt, Comptes Rendus de l'Académie des Sciences, Paris, vol. 185 
(1927), pp. 1098-1100; H. Cartan, Annales de l'École Normale Supérieure, (3), vol. 45 (1928), 
pp. 255-346; J. L. Walsh, Tôhoku Mathematical Journal, vol. 38 (1933), pp. 375-389. 




The proof of Theorem 4 is repeated verbatim, and we find, as before, that 
for any positive $\epsilon$ there exists a positive number $\rho < 1$ such that for all 
sufficiently large values of n 

$|\varphi_n(\zeta)| < |\varphi_n(0)|^{1+\epsilon}$ 

in the circle $|\zeta| \le \rho$. 
Now, according to inequality (19.5’), we obtain 


Ld 


j 1 D n 1/9 
= | < 24| w,|** 


j=l \ | 


Ld 


j=l 
whence, applying (2.3), we find for n sufficiently large 


(= 1) she (1 — | |*) 


G-»)! 


Lid 


j=l 


Since (24.6) implies the relation 


lim | w, | = 0, 


we obtain (24.7). 
25. Mandelbrojt's theorem. The following theorem is due to S. 
Mandelbrojt(62): 


THEOREM A. Let $f_n(z)$ be a sequence of functions analytic in a region R and 
tending uniformly in R to infinity. If there exists a positive constant M such that 
for all n and for all z in R 

(25.1) $|\arg f_n(z)| < M$ 

with some determination of the argument, then to every closed region $R_1$ wholly 
interior to R there corresponds a finite positive number $\alpha$ ($1 < \alpha < \infty$) and a 
positive integer $n_0$ such that for every pair of points $z_0$ and $z_1$ in $R_1$ and for every 
$n > n_0$, the inequality 

(25.2) $\frac{1}{\alpha} < \left|\frac{f_n(z_1)}{f_n(z_0)}\right| < \alpha$ 

holds. 


(62) S. Mandelbrojt, loc. cit. 




We indicate a proof of Theorem A(63). Let us first prove the assertion of 
the theorem in the special case that $R_1$ is the circle C: $|z - a| \le \rho$ lying wholly 
in R. It is clear by hypothesis that for sufficiently large values of n the 
functions $f_n(z) \ne 0$ in C, and henceforth we shall consider only such values of n. 
Hence, the functions $i \log f_n(z)$ will be regular in C, single-valued in C after 
a particular determination of the logarithm is selected. We choose that 
determination for which $\Re[-i \log f_n(z)] = \arg f_n(z)$, where the argument is the one 
asserted in (25.1). Now, take a circle C': $|z - a| \le \rho'$ for which $\rho' > \rho$ and 
which also lies wholly in R; choose n so large that $f_n(z) \ne 0$ in C'. 

In C' we have the representation 

(25.3) $-\log|f_n(a + re^{i\theta})| + \log|f_n(a)| = \frac{1}{2\pi}\int_0^{2\pi} \arg f_n(a + \rho' e^{i\varphi})\,\frac{2\rho' r \sin(\theta - \varphi)}{\rho'^2 + r^2 - 2\rho' r \cos(\theta - \varphi)}\,d\varphi$. 

Let $z_0 = a + r_0 e^{i\theta_0}$ and $z_1 = a + r_1 e^{i\theta_1}$ be any two points of C. We may, then, 
write (25.3) for the points $z_0$ and $z_1$ and subtract the second equation from the 
first. Thus, we obtain the equation 

(25.4) $\log\left|\frac{f_n(z_1)}{f_n(z_0)}\right| = \frac{1}{\pi}\int_0^{2\pi} \arg f_n(a + \rho' e^{i\varphi}) \left[\frac{\rho' r_0 \sin(\theta_0 - \varphi)}{\rho'^2 + r_0^2 - 2\rho' r_0 \cos(\theta_0 - \varphi)} - \frac{\rho' r_1 \sin(\theta_1 - \varphi)}{\rho'^2 + r_1^2 - 2\rho' r_1 \cos(\theta_1 - \varphi)}\right] d\varphi$. 

Taking absolute values in (25.4) and observing (25.1), we find 

$\left|\log\left|\frac{f_n(z_1)}{f_n(z_0)}\right|\right| \le \frac{M}{\pi}\int_0^{2\pi} \left|\frac{\rho' r_0 \sin(\theta_0 - \varphi)}{\rho'^2 + r_0^2 - 2\rho' r_0 \cos(\theta_0 - \varphi)} - \frac{\rho' r_1 \sin(\theta_1 - \varphi)}{\rho'^2 + r_1^2 - 2\rho' r_1 \cos(\theta_1 - \varphi)}\right| d\varphi$. 

But since $z_0$ and $z_1$ lie in the circle C, an easy calculation shows that 

(25.5) $\left|\log\left|\frac{f_n(z_1)}{f_n(z_0)}\right|\right| \le \frac{4\rho\rho' M}{(\rho' - \rho)^2}$. 

The right-hand side in (25.5) is independent of the pair of points $z_0$ and $z_1$. 
From (25.5) follows at once the assertion of Theorem A in the case that $R_1$ 
is a circle. 


We now pass to the general case. Let $R_1'$ be any closed region contained 
in R and itself containing $R_1$ in its interior. Consider the class of all open 


(63) The proof of this theorem given by Mandelbrojt, loc. cit., is not clear to the writers. 
The proof given in the text was suggested to the authors by Professor S. E. Warschawski. 




circles with centers in $R_1'$ contained together with their boundaries in R. In 
accordance with the Heine-Borel theorem one may select out of this class a 
finite number of circles which cover $R_1'$. Denote this number by N. By the 
first part of the proof with each one of these circles there is associated a 
number $\alpha_\nu$ ($1 < \alpha_\nu < \infty$) and a positive integer $\lambda_\nu$, such that for any pair of points $z_0$ 
and $z_1$ in that circle and for $n > \lambda_\nu$, 

$\frac{1}{\alpha_\nu} < \left|\frac{f_n(z_1)}{f_n(z_0)}\right| < \alpha_\nu$. 

Let $\beta$ and $n_0$ be the largest of the numbers $\alpha_\nu$ and $\lambda_\nu$, respectively. Then, for 
$n > n_0$ and for any pair of points in any one of those circles we have 

(25.6) $\frac{1}{\beta} < \left|\frac{f_n(z_1)}{f_n(z_0)}\right| < \beta$. 

Now consider any two points $z_0$ and $z_1$ in $R_1$. Connect $z_0$ and $z_1$ by a simple 
polygonal line P lying wholly in $R_1'$ and so chosen as not to be tangent to 
any circle of the above class. Denote by $C_1$ any circle of the above class which 
contains the point $z_0$. As one travels along P from $z_0$ to $z_1$, there will be a last 
point of intersection $\zeta_1$ of P with the circumference of $C_1$. Denote by $C_2$ any 
circle of the above class which contains $\zeta_1$. Between $z_0$ and $\zeta_1$ on P choose 
any point $\xi_1$ common to both $C_1$ and $C_2$. Now, starting with the point $\xi_1$ which 
belongs to $C_2$, repeat the argument. We obtain in this manner a point $\xi_2$ of P 
which is common to two circles $C_2$ and $C_3$ of the above family. Proceeding in 
this manner, after a finite number of steps we come to a first circle $C_k$ which 
contains the point $z_1$. It is clear from (25.6) that for $n > n_0$ 

$\frac{1}{\beta^N} \le \frac{1}{\beta^k} < \left|\frac{f_n(z_1)}{f_n(z_0)}\right| = \left|\frac{f_n(\xi_1)}{f_n(z_0)}\right| \cdot \left|\frac{f_n(\xi_2)}{f_n(\xi_1)}\right| \cdots \left|\frac{f_n(z_1)}{f_n(\xi_{k-1})}\right| < \beta^k \le \beta^N$. 

Setting $\beta^N = \alpha$, we obtain the constant asserted in Theorem A. 

As Mandelbrojt himself points out, these results may be readily extended 
to the case of a sequence of functions $f_n(z)$ regular in R which converges 
uniformly in R to an analytic function f(z) in such a manner that the differences 
$f_n(z) - f(z)$ do not vanish in R. 

Theorem A may be used to obtain a result related to Theorem 4. 

THEOREM 6. Let f(z) be analytic in |z| < 1 and omit two values there, 
including the value w = a. Let $\{z_n\}$ ($|z_n| < 1$) be a sequence of points in |z| < 1 such 
that, setting $w_n = f(z_n)$, we have $\lim_{n \to \infty} |w_n| = \infty$. If arg [f(z) - a] is uniformly 
bounded in |z| < 1(64), then the condition 

$\lim_{n \to \infty} |w_n| \cdot D_1(w_n) = 0$ 


(64) Geometrically, this condition means that the Riemann surface on which w = f(z) maps 
the unit circle |z| < 1 does not wind infinitely many times about the point w = a. 

-- 


1942] FUNCTIONS ANALYTIC IN THE UNIT CIRCLE 
implies the relation 

lim | f’(¢n) | (1 —| = 0. 


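The boundedness of $\arg[f(z) - a]$ implies the boundedness in $|\zeta| < 1$ of
$\arg[\phi_n(\zeta) - a]$, where

$$\phi_n(\zeta) = f\left( \frac{\zeta + z_n}{1 + \bar{z}_n \zeta} \right).$$

Just as in the proof of Theorem 4 we infer that

$$\lim_{n\to\infty} |\phi_n(\zeta)| = \infty$$

uniformly in every closed subregion of $|\zeta| < 1$. Hence,

$$\lim_{n\to\infty} |\phi_n(\zeta) - a| = \infty$$

uniformly in every closed subregion of $|\zeta| < 1$. We may therefore apply Theorem A
of Mandelbrojt to the sequence of functions $\phi_n(\zeta) - a$ in the circle
$|\zeta| < \rho$, where $\rho$ is any fixed positive number less than unity. It follows that
corresponding to any circle $|\zeta| \le \rho_1 < \rho$ one may assign a finite positive number
$\alpha$ and a positive integer $n_0$ such that for any pair of points $\zeta$, $\zeta_0$ in $|\zeta| \le \rho_1$

$$\frac{1}{\alpha} < \left| \frac{\phi_n(\zeta) - a}{\phi_n(\zeta_0) - a} \right| < \alpha$$

for every $n > n_0$. In particular, choosing $\zeta_0 = 0$, we obtain the inequality

$$|\phi_n(\zeta) - a| < \alpha |\phi_n(0) - a|$$

and

$$(25.7) \qquad |\phi_n(\zeta)| < \alpha |\phi_n(0)| + (\alpha + 1)|a| = \alpha |w_n| + (\alpha + 1)|a|$$

in $|\zeta| \le \rho_1$ provided $n > n_0$.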

Thus, one may apply Theorem 3 of Chapter II where $M = M_n = \alpha |w_n| + (\alpha + 1)|a|$.
The theorem follows at once. Conditions more delicate than
those given in Theorems 4 and 5 may be obtained by different methods. Thus,
it may be shown that the condition (24.1) may be replaced by the less stringent
condition

$$\lim_{n\to\infty} |w_n| (\log |w_n|)^{1+\epsilon} D_1(w_n) = 0 .$$


This result, and other analogous ones, will be developed in a later joint paper 
of A. S. Galbraith, W. Seidel, and J. L. Walsh. 

26. Counterexample for unrestricted functions. In obtaining relations between
$|f'(z)| (1 - |z|)$ and $D_1(w)$ we have always restricted the class of functions
$f(z)$. We have thus far considered univalent functions, bounded functions,
and functions omitting two values. That these or similar restrictions are
essential is shown by the following example.


THEOREM 7. There exists a function $f(z)$ analytic in the unit circle $|z| < 1$ and
a sequence of points $\{z_n\}$ ($|z_n| < 1$, $|z_n| \to 1$) such that, setting $w_n = f(z_n)$, we have
$D_1(w_n) \to 0$, $|w_n|$ bounded, and

$$\lim_{n\to\infty} |f'(z_n)| (1 - |z_n|) = 4\pi .$$


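Consider the function

$$w = f(z) = \sin^2 W,$$

where $W = (1+z)/(1-z)$. It follows that

$$f'(z) = \frac{2 \sin 2W}{(1-z)^2} .$$

Let us set

$$z_n = \frac{1/n + 2\pi n - 1}{1/n + 2\pi n + 1}, \qquad \zeta_n = \frac{2\pi n - 1}{2\pi n + 1} .$$

We find

$$f'(z_n)(1 - z_n) = \left( 1 + \frac{1}{n} + 2\pi n \right) \sin \frac{2}{n}, \qquad \lim_{n\to\infty} f'(z_n)(1 - z_n) = 4\pi .$$

On the other hand, setting $w_n = f(z_n)$, it is clear that $D_1(w_n)$ cannot exceed
the length of the image of the segment joining the points $\zeta_n$ and $z_n$, since the
point $\zeta_n$ is mapped onto a branch point of the Riemann surface. The length
of this image is given by the integral

$$\int_{\zeta_n}^{z_n} \left| \frac{2 \sin 2W}{(1-z)^2} \right| |dz| \le 2 \cdot \frac{(1/n + 2\pi n + 1)^2}{4} \cdot \frac{2}{n (1/n + 2\pi n + 1)(2\pi n + 1)} .$$

Hence,

$$D_1(w_n) \le \frac{1/n + 2\pi n + 1}{2\pi n + 1} \cdot \frac{1}{n}, \qquad \lim_{n\to\infty} D_1(w_n) = 0 .$$

Finally $w_n = \sin^2 (1/n + 2\pi n) = \sin^2 (1/n)$, so that $\lim_{n\to\infty} w_n = 0$. This
completes the proof of the theorem.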
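
The limits in this example can be checked numerically. The following is a minimal sketch in Python, not part of the original argument; the helper names (`scaled_derivative`, `closed_form`, `length_bound`) are illustrative assumptions, and the formulas are the ones written out in the proof above.

```python
import math

def W(z):
    # Auxiliary map W = (1 + z) / (1 - z) of the unit circle onto a half-plane.
    return (1 + z) / (1 - z)

def f(z):
    # The example function f(z) = sin^2 W.
    return math.sin(W(z)) ** 2

def fprime(z):
    # f'(z) = 2 sin(2W) / (1 - z)^2, by the chain rule.
    return 2 * math.sin(2 * W(z)) / (1 - z) ** 2

def z_n(n):
    # Point of the sequence: the preimage of W = 1/n + 2*pi*n.
    w = 1 / n + 2 * math.pi * n
    return (w - 1) / (w + 1)

def scaled_derivative(n):
    # f'(z_n) * (1 - z_n), evaluated directly.
    z = z_n(n)
    return fprime(z) * (1 - z)

def closed_form(n):
    # The same quantity in closed form: (1 + 1/n + 2*pi*n) * sin(2/n).
    return (1 + 1 / n + 2 * math.pi * n) * math.sin(2 / n)

def length_bound(n):
    # Upper bound for D_1(w_n): (1/n + 2*pi*n + 1) / (n * (2*pi*n + 1)).
    w = 1 / n + 2 * math.pi * n
    return (w + 1) / (n * (2 * math.pi * n + 1))

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, scaled_derivative(n), closed_form(n), f(z_n(n)), length_bound(n))
```

As n grows, the first two printed columns should approach 4π while the last two tend to zero, in agreement with Theorem 7.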
The idea of this example, as well as of the examples of §§12 and 23, is the
following. It is not true, as is well known(*), that $f_n(z)$ analytic for $|z| < 1$,
$f_n'(0) = 1$, $f_n(0) = 0$, implies that $w = f_n(z)$ maps $|z| < 1$ onto a Riemann
configuration which contains in its interior a fixed smooth circle whose center
is at the origin. The simplest counterexample is perhaps

$$f_n(z) = z - n z^2 .$$

The derivative $f_n'(z) = 1 - 2nz$ vanishes for $z = 1/2n$ and the corresponding
value of $w$ is $f_n(1/2n) = 1/4n$, which approaches zero.
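
The arithmetic of this counterexample is easy to confirm with exact rationals. The sketch below is illustrative only; the function names are assumptions introduced here.

```python
from fractions import Fraction

def f(n, z):
    # The counterexample f_n(z) = z - n z^2.
    return z - n * z * z

def fprime(n, z):
    # Its derivative f_n'(z) = 1 - 2 n z.
    return 1 - 2 * n * z

def critical_point(n):
    # The zero of f_n', namely z = 1/(2n).
    return Fraction(1, 2 * n)

def critical_value(n):
    # Image of the critical point; this is the branch point of the image surface.
    return f(n, critical_point(n))

if __name__ == "__main__":
    for n in (1, 5, 50):
        print(n, critical_point(n), critical_value(n))
```

The normalization f_n(0) = 0, f_n'(0) = 1 holds for every n, while the branch point 1/(4n) moves toward the origin.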

This example indicates that the phenomenon of a branch point's approaching
the origin is not dependent on the transcendentality of $f_n(z)$, or
even on the possibility that an ever-increasing number of sheets of the image
of $|z| < 1$ should come together. It is a matter primarily of having the image
of a point at which $f_n'(z)$ vanishes approach the origin. The examples mentioned
above were constructed with this idea in mind.


CHAPTER V. MISCELLANEOUS 


27. Limit values of analytic functions. The methods developed in the pres- 
ent paper have close connections with the general subject of limit values of 
functions analytic in the unit circle, including various theorems due to 
Lindelöf and to Montel. We proceed now to discuss such connections.


THEOREM 1. Let the function $f(z)$ be analytic for $|z| < 1$ and omit two values
there. Suppose for the sequence $\{z_n\}$ with $|z_n| < 1$ we have $\lim_{n\to\infty} f(z_n) = a$, where
$a$ is finite or infinite. Let the non-euclidean distance $\rho(z_n, z_n')$ between $z_n$ and $z_n'$
approach zero as $n$ becomes infinite, with $|z_n'| < 1$. Then we have $\lim_{n\to\infty} f(z_n') = a$.


We define as usual the functions $g_n(\zeta)$:

$$(27.1) \qquad g_n(\zeta) = f\left( \frac{\zeta + z_n}{1 + \bar{z}_n \zeta} \right),$$

whence $g_n(0) = f(z_n)$. If we set

$$(27.2) \qquad z_n' = \frac{\zeta_n' + z_n}{1 + \bar{z}_n \zeta_n'},$$

we have $g_n(\zeta_n') = f(z_n')$, and the non-euclidean distance

$$(27.3) \qquad \rho(0, \zeta_n') = \rho(z_n, z_n')$$

approaches zero as $n$ becomes infinite. The family $g_n(\zeta)$ omits two values in
$|\zeta| < 1$, hence is normal there. Given any infinite sequence of indices $n$, there
can be extracted a subsequence for which the corresponding functions $g_n(\zeta)$
converge for $|\zeta| < 1$, uniformly in every closed subregion, to some limit function
$g(\zeta)$, with $g(0) = \lim_{n\to\infty} g_n(0) = a$. The approach to zero of $\rho(0, \zeta_n')$ implies
the approach to zero of $\zeta_n'$; so for the subsequence of indices considered the
uniformity of convergence yields $\lim_{n\to\infty} g_n(\zeta_n') = a$. Thus from any subsequence
of the sequence $\{f(z_n')\}$ can be extracted a new subsequence converging
to the limit $a$, which implies the conclusion of Theorem 1.

(*) See, for example, P. Montel, Leçons sur les Fonctions Univalentes ou Multivalentes,
Paris, 1933, p. 121, where a different example is given.
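
The invariance expressed by (27.3) can be illustrated numerically. The sketch below is an illustration under a stated assumption: it uses the pseudo-hyperbolic quantity |(a - b)/(1 - conj(b) a)| as a stand-in for the non-euclidean distance, since the paper's own normalization of that distance is fixed in an earlier chapter not reproduced here. Any non-euclidean distance is an increasing function of this quantity, so the invariance carries over. The function names are hypothetical.

```python
def mobius(zeta, z_n):
    # The map zeta -> (zeta + z_n) / (1 + conj(z_n) * zeta) used in (27.1) and (27.2).
    return (zeta + z_n) / (1 + z_n.conjugate() * zeta)

def pseudo_hyperbolic(a, b):
    # |(a - b) / (1 - conj(b) * a)|: an assumed stand-in for the non-euclidean distance.
    return abs((a - b) / (1 - b.conjugate() * a))

if __name__ == "__main__":
    z_n = complex(0.6, 0.3)
    zeta_prime = complex(0.1, -0.2)
    z_prime = mobius(zeta_prime, z_n)
    # Distance from 0 to zeta' equals the distance from z_n to z_n'.
    print(pseudo_hyperbolic(zeta_prime, 0j), pseudo_hyperbolic(z_prime, z_n))
```

Both printed values agree up to rounding, which is the content of (27.3) for this choice of distance.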


THEOREM 2. Let $f(z)$ be analytic for $|z| < 1$ and omit two values there. Let
the sequence $\{z_n\}$ with $|z_n| < 1$ have the property that $\lim_{n\to\infty} f(z_n) = a$, where $a$ is
finite. Then a necessary and sufficient condition that the sequence $\{z_n\}$ be regular(*) is

$$(27.4) \qquad \lim_{n\to\infty} g_n(\zeta) = a \quad \text{for } |\zeta| < 1,$$

uniformly in every closed subregion, where $g_n(\zeta)$ is defined by (27.1).


Let a sequence $\{z_n'\}$ be given for which $\rho(z_n, z_n')$ is bounded.

Again we define $\zeta_n'$ by equation (27.2), from which it follows that (27.3) is
valid, and the non-euclidean distance $\rho(0, \zeta_n')$ is bounded. The sufficiency of
(27.4) is obvious, for (27.4) implies that $g_n(\zeta_n') \to a$, which is the conclusion
to be established; we note that here the $\lambda$ of §11, Definition 1, can be taken
arbitrarily large. We proceed to show the necessity of (27.4).

If the sequence $\{z_n\}$ is regular but (27.4) is not satisfied, there exists a
sequence of indices $n_k$ such that $\lim_{k\to\infty} g_{n_k}(\zeta) = g_0(\zeta)$ for $|\zeta| < 1$, uniformly in
every closed subregion, where $g_0(\zeta)$ is analytic but not identically equal to $a$
in $|\zeta| < 1$. Suppose for definiteness $g_0(\zeta_0) \ne a$, where the non-euclidean distance
$\rho(0, \zeta_0)$ is less than the $\lambda$ of §11, Definition 1. If we define $z_n'$ by the
equation

$$z_n' = \frac{\zeta_0 + z_n}{1 + \bar{z}_n \zeta_0},$$

we have

$$\lim_{k\to\infty} f(z_{n_k}') = g_0(\zeta_0) \ne a, \qquad \rho(z_n, z_n') = \rho(0, \zeta_0) < \lambda,$$

contrary to hypothesis.

In Theorem 2 we have for simplicity assumed that $f(z)$ omits two values
in $|z| < 1$. It is obviously sufficient if $f(z)$ omits two values in the non-euclidean
circle with non-euclidean center $z_n$ and non-euclidean radius $\rho_n$,
where $\rho_n$ has a positive lower bound as $n$ becomes infinite. A similar remark
applies to the later results of the present section.

(*) For the definition of regularity see Definition 1 of §11, Chapter II.

A consequence of the foregoing remark is that if $f(z)$ is analytic for $|z| < 1$,
if $|z_n| < 1$, if $\lim_{n\to\infty} f(z_n) = a$, where $a$ is finite, and if the sequence $\{z_n\}$ is irregular,
then $f(z)$ has at most one omitted value in each set of non-euclidean
circles with non-euclidean radius $\rho_n$, where $\rho_n$ has a positive lower bound. We
consider pathology in more detail in §30. In Theorem 2, we have assumed the
finiteness of $a$. A result without this restriction appears in


COROLLARY 1. Let $f(z)$ be analytic for $|z| < 1$ and omit two values there.
Let the sequence $\{z_n\}$ with $|z_n| < 1$ have the property that $\lim_{n\to\infty} f(z_n) = \infty$. Then
if $|z_n'| < 1$ and if the non-euclidean distance $\rho(z_n, z_n')$ is bounded, we have also
$\lim_{n\to\infty} f(z_n') = \infty$.


Since the functions $g_n(\zeta)$ form a normal family in $|\zeta| < 1$ and since
$g_n(0) \to \infty$, we have $\lim_{n\to\infty} g_n(\zeta) = \infty$ in $|\zeta| < 1$, uniformly in every closed
subregion. Our conclusion is an immediate consequence. We turn to another
result.


COROLLARY 2. Let $f(z)$ be analytic for $|z| < 1$ and omit there two values including
the value $a$. Let the sequence $\{z_n\}$ with $|z_n| < 1$ have the property that
$\lim_{n\to\infty} f(z_n) = a$. Then if $|z_n'| < 1$ and if the non-euclidean distance $\rho(z_n, z_n')$ is
bounded, we have also $\lim_{n\to\infty} f(z_n') = a$(*).


From every infinite subsequence of the set $g_n(\zeta)$ defined by (27.1) can be
extracted a new subsequence which converges for $|\zeta| < 1$, uniformly in every
closed subregion. The limit of this new subsequence is $a$ in the point $\zeta = 0$,
hence by Hurwitz's theorem is identically $a$ in $|\zeta| < 1$. Then we have
$\lim_{n\to\infty} g_n(\zeta) = a$ for $|\zeta| < 1$, uniformly in every closed subregion. Our conclusion
follows as in the first part of the proof of Theorem 2. Thus the sequence
$\{z_n\}$ is regular, and the number $\lambda$ of §11, Definition 1, may be chosen arbitrarily.
A generalization of Theorem 2 is


COROLLARY 3. Let $f(z)$ be analytic for $|z| < 1$ and omit two values there, and
let the sequence $w_n = f(z_n)$ with $|z_n| < 1$ be bounded. Then a necessary and sufficient
condition that the sequence $\{z_n\}$ be regular is

$$(27.5) \qquad \lim_{n\to\infty} [g_n(\zeta) - g_n(0)] = 0 \quad \text{for } |\zeta| < 1,$$

uniformly in every closed subregion, where $g_n(\zeta)$ is defined by (27.1).


If the sequence $\{z_n\}$ is regular, it follows that from any subsequence of
the $g_n(\zeta)$ can be extracted a subsequence such that $\lim_{n\to\infty} g_n(\zeta)$ exists
for $|\zeta| < 1$, uniformly in every closed subregion; for this subsequence
$\lim_{n\to\infty} g_n(0) = a$ exists and by Theorem 2 the relation (27.5) holds for that
subsequence. Thus, from any subsequence of the $g_n(\zeta)$ can be extracted a new
subsequence such that (27.5) holds for that subsequence; so (27.5) itself is
satisfied.

Conversely, if (27.5) is satisfied, and if $\rho(z_n, z_n') = \rho(0, \zeta_n')$ is bounded, it
follows that $\lim_{n\to\infty} [g_n(\zeta_n') - g_n(0)] = 0$, so the sequence $\{z_n\}$ is regular. Of
course, this latter conclusion is independent of any assumption that $f(z)$ omit
two values.

(*) This result for bounded functions was established by a different method by one of the
present authors: W. Seidel, these Transactions, vol. 34 (1932), pp. 1-21; especially Theorem 3,
p. 10.

Two further propositions relate Theorem 2 to the results of §§17 and 22. 


COROLLARY 4. Let the function $f(z)$ be analytic in $|z| < 1$ and omit two values
there, and let the sequence $\{f(z_n)\}$ be bounded, $|z_n| < 1$. A necessary and sufficient
condition that $\{z_n\}$ be a regular sequence for $f(z)$ is

$$(27.6) \qquad \lim_{n\to\infty} |f^{(k)}(z_n)| (1 - |z_n|^2)^k = 0, \qquad k = 1, 2, 3, \dots .$$

From the sequence $f(z_n)$ can be extracted a subsequence $f(z_{n_j})$ which approaches
a limit $a$. A necessary and sufficient condition for (27.4) for the sequence
$\{n_j\}$ is

$$(27.7) \qquad \lim_{j\to\infty} g_{n_j}^{(k)}(0) = 0, \qquad k = 1, 2, 3, \dots,$$

since the functions $g_n(\zeta)$ form a normal family in $|\zeta| < 1$. Equations (27.6) are
equivalent to equations (27.7) if the latter are assumed to hold for a suitable
subsequence $\{n_j\}$ of an arbitrary sequence of indices.


COROLLARY 5. Let the function $f(z)$ be analytic and omit two values in $|z| < 1$.
A necessary and sufficient condition that $\{z_n\}$ be a regular sequence for $f(z)$,
where we assume $w_n = f(z_n)$ bounded, is

$$\lim_{n\to\infty} D_p(w_n) = 0, \qquad p = 1, 2, 3, \dots .$$

Corollary 5 follows from Corollary 4 by virtue of our fundamental Theorem 1
of §22.
A further consequence of Corollary 5 is


COROLLARY 6. Let the function $f(z)$ be analytic in $|z| < 1$ and omit two values
there. If the sequence of points $w_n = f(z_n)$, with $|z_n| < 1$, approaches a finite
boundary point of the Riemann configuration on which $w = f(z)$ maps $|z| < 1$,
then the sequence $\{z_n\}$ is regular.

It is worth remarking that Corollary 2 is a consequence of Corollary 5
or Corollary 6, without the use of Hurwitz's theorem.

Theorems 1 and 2 are of particular interest if the function f(z) approaches 
a limit along an arc. 


THEOREM 3. Let $f(z)$ be analytic in $|z| < 1$ and omit two values there. Let the
Jordan arc $C$ lie in $|z| < 1$ except for the end point $z = 1$. Suppose

$$(27.8) \qquad \lim_{z \to 1;\ z \text{ on } C} f(z) = a,$$

where $a$ is finite. Then any sequence $\{z_n\}$ on $C$ for which $z_n \to 1$ is regular.

From any subsequence of the sequence $g_n(\zeta)$ defined by (27.1) can be extracted
a new subsequence converging to some function $g_0(\zeta)$ for $|\zeta| < 1$, uniformly
for $|\zeta| \le d < 1$. Let $h$ be arbitrary, $0 < h < \infty$. Let $z_n'$ be a point of $C$
between the points $z_n$ and $z = 1$ with $\rho(z_n, z_n') = h$; such a point $z_n'$ exists with
$\lim_{n\to\infty} z_n' = 1$. If $\zeta_n'$ is defined by (27.2), we have $\rho(0, \zeta_n') = h$. The equation
(27.8) implies $\lim_{n\to\infty} f(z_n') = a$, whence $\lim_{n\to\infty} g_n(\zeta_n') = a$. On each circle
$|\zeta| = d < 1$ lies a sequence of points $\zeta_n'$ for which $g_n(\zeta_n')$ approaches $a$, so on
each such circle lies at least one point $\zeta$ at which $g_0(\zeta) = a$. Consequently
$g_0(\zeta) \equiv a$ in $|\zeta| < 1$, every limit function of the sequence $g_n(\zeta)$ is identically $a$,
this limit is approached by $g_n(\zeta)$ itself throughout $|\zeta| < 1$, uniformly in any
closed subregion; our conclusion follows from Theorem 2.

The method of proof of Theorem 3 establishes also the following: Let $f(z)$
be analytic in $|z| < 1$ and omit two values. Let $z_n \to 1$, with $|z_n| < 1$, and let the
non-euclidean distance $\rho(z_n, z_{n+1})$ approach zero. If $\lim_{n\to\infty} f(z_n)$ exists, then the
sequence $\{z_n\}$ is regular. By way of proof, we need merely modify the proof
of Theorem 3 by considering instead of the arbitrary circle $|\zeta| = d < 1$ an arbitrary
annulus $0 < d_1 \le |\zeta| \le d_2 < 1$; each such annulus contains a sequence
of points $\zeta_n'$ for which $g_n(\zeta_n')$ approaches $a$, so each closed annulus contains
at least one point $\zeta$ in which $g_0(\zeta) = a$.

The method of proof of Theorem 3 can be used to prove still another
proposition: Let $f(z)$ be analytic in $|z| < 1$ and omit there two values. Suppose
for real $z$ we have $\lim_{z\to 1} f(z) = a$. Then we have uniformly for approach within any
triangle in $|z| < 1$

$$\lim_{z\to 1} f^{(k)}(z) (1 - |z|^2)^k = 0, \qquad k = 1, 2, 3, \dots .$$

In this proof, we need merely choose $r$, $0 < r < 1$, and the sequence of real $z_n$
in such a way that under the transformation $z = (\zeta + z_n)/(1 + z_n \zeta)$ each point $z$
of the given triangle corresponds to some $\zeta$ in $|\zeta| < r$. Various extensions of
the proposition by the present methods suggest themselves, and are left to
the reader.

We shall introduce the notion of the non-euclidean Fréchet distance between
two curves. Let $C_1$ and $C_2$ be two open Jordan arcs lying in $|z| < 1$. Consider a
topological map $T$ of $C_1$ on $C_2$. Denote by $F_T(C_1, C_2)$ the least upper bound
(finite or infinite) of the non-euclidean distances between points of $C_1$ and $C_2$
which correspond in the map $T$. The greatest lower bound (finite or infinite)
of the quantities $F_T(C_1, C_2)$ for all possible maps $T$ will be called the non-euclidean
Fréchet distance $F(C_1, C_2)$ between $C_1$ and $C_2$. With this definition we
prove


THEOREM 4. Let $f(z)$ be analytic in $|z| < 1$ and omit two values there. Let
$C_1$ and $C_2$ be Jordan arcs which, except for the common end point $z = 1$, lie in
$|z| < 1$, and let $F(C_1, C_2)$ be finite. If

$$\lim_{z \to 1;\ z \text{ on } C_1} f(z) = a,$$

where $a$ is finite or infinite, then also

$$\lim_{z \to 1;\ z \text{ on } C_2} f(z) = a .$$

To any sequence $z_n$ on $C_1$ which approaches $z = 1$ corresponds a sequence
$z_n'$ on $C_2$ such that $\rho(z_n, z_n')$ is bounded. If $a$ is finite, our conclusion follows
from Theorem 3. If $a$ is infinite, it follows from Corollary 1 to Theorem 2.

If the two Jordan arcs $C_1$ and $C_2$ of Theorem 4 are tangent and have the
same order of contact with $|z| = 1$ at $z = 1$, then $F(C_1, C_2)$ is finite. For transform
by a linear transformation of the complex variable the region $|z| < 1$
onto the upper half of the $w$ ($= x + iy$)-plane, so that $z = 1$ corresponds to
$w = 0$. We shall assume that in the neighborhood of $w = 0$ we may set up a
one-to-one correspondence between the arcs $C_1$: $y = y_1(x)$ and $C_2$: $y = y_2(x)$ by
means of the ordinates $x = $ constant. The non-euclidean distance between corresponding
points of the two curves reduces to

$$\left| \log \frac{y_1(x)}{y_2(x)} \right|,$$

which by the assumption on order of contact is bounded. In studying the
finiteness of $F(C_1, C_2)$, we may confine ourselves to the neighborhood of the
point $z = 1$, so under the present hypothesis $F(C_1, C_2)$ is finite.
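
The reduction of the distance along a vertical line can be checked against the standard half-plane distance formula. The sketch below assumes the usual normalization (curvature -1), for which the distance between z and w in the upper half-plane is arccosh(1 + |z - w|^2 / (2 Im z Im w)); the paper's own normalization may differ by a constant factor, which does not affect boundedness. The function name is hypothetical.

```python
import math

def half_plane_distance(z, w):
    # Non-euclidean distance in the upper half-plane, curvature -1 normalization (assumed).
    return math.acosh(1 + abs(z - w) ** 2 / (2 * z.imag * w.imag))

if __name__ == "__main__":
    # Two arcs with the same order of contact at the origin: y1 = x^2 and y2 = 3 x^2.
    for x in (1e-1, 1e-2, 1e-3):
        p = complex(x, x ** 2)
        q = complex(x, 3 * x ** 2)
        print(x, half_plane_distance(p, q), abs(math.log(p.imag / q.imag)))
```

For points on the same ordinate the two printed columns coincide, and for these two arcs the value stays equal to log 3 as x tends to zero, so the correspondence by ordinates gives a bounded distance.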

If the two Jordan arcs $C_1$ and $C_2$ of Theorem 4 are, except for the point
$z = 1$, contained in the lens-shaped region between two hypercycles through
$z = \pm 1$, and possess tangents at the point $z = 1$, we may set up a one-to-one
correspondence between their points by the circles of the coaxial family determined
by $z = \pm 1$ as null circles. Transformation of an arbitrary circle of
that family into the axis of imaginaries by a transformation which leaves invariant
$z = 1$, $z = -1$, and $|z| = 1$, as well as the two given hypercycles, shows
that $F(C_1, C_2)$ is finite. Thus we have the


COROLLARY. The condition of Theorem 4 that $F(C_1, C_2)$ be finite is satisfied
if $C_1$ and $C_2$ are tangent and have contact of the same order with $|z| = 1$ at $z = 1$,
or if $C_1$ and $C_2$ possess tangents at $z = 1$ but neither is tangent to $|z| = 1$ at $z = 1$.


The foregoing discussion has intimate connections with well known results
on the limit values of analytic functions. The proof of Theorem 2 establishes
the uniformity for all $z_n'$ of $\lim_{n\to\infty} f(z_n')$ provided merely $\rho(z_n, z_n')$ is uniformly
bounded. With this addition, Theorem 4 and its corollary include the theorem
of Lindelöf that if $f(z)$ is analytic in $|z| < 1$ and omits two values there, and if
$\lim_{z\to 1} f(z)$ exists for approach along a line segment in $|z| \le 1$, then that limit
exists uniformly for approach within an arbitrary triangle contained in
$|z| \le 1$. Likewise the corollary to Theorem 4 includes the theorem of Montel
that if $f(z)$ is analytic in $|z| < 1$ and omits two values there, and if $\lim_{z\to 1} f(z)$
exists for approach along the arc of an oricycle, then that limit exists uniformly
for approach between any two arcs of oricycles tangent at $z = 1$ to the
original arc.

We add the general remark that the method of the present section seems
to have further wide use in the study of limit values of analytic functions; for
instance this method easily proves that if $f(z)$ is analytic and bounded in
$|z| < 1$, continuous on $|z| = 1$ or an open arc $A$ of $|z| = 1$ with $z = 1$ as an end
point, and if on this arc $\lim_{z \to 1;\ z \text{ on } A} f(z) = a$, then also the limit of $f(z)$ is $a$ uniformly
as $|z| \to 1$ between $A$ and the axis of reals.

28. Extension of Bloch’s theorem. Another application of the results of 
Chapter III deals with an extension of Bloch’s theorem. We prove the follow- 
ing, which for p=1 reduces to Bloch’s theorem. 


THEOREM 5. Let $w = f(z)$ be regular in $|z| < 1$ and let $f^{(p)}(0) = 1$. There exists
an absolute positive constant $B_p$, independent of the function $f(z)$, so that the
Riemann configuration $R_f$ on which $w = f(z)$ maps the circle $|z| < 1$ contains at
least one point $w_0$ for which $D_p(w_0) \ge B_p$. The constant $B_p$ may be taken equal to
where M,/p!, Mp=M,(M) being the constant of Theorem 1, Chap- 
ter III, taken for M=2?-p!. 


We assume first that $f(0) = 0$ and that $f(z)$ is regular in $|z| \le 1$. Let

$$M_p(r) = \max_{|z| = r} |f^{(p)}(z)| .$$

We have $M_p(0) = 1$ and the function $M_p(r)$ is continuous and non-decreasing
in the interval $0 \le r \le 1$. The function

$$\phi(r) = (1 - r)^p M_p(r)$$

is also continuous in $0 \le r \le 1$ and $\phi(0) = 1$, $\phi(1) = 0$. Hence, there exists a
number $r_0$ ($0 \le r_0 < 1$) such that $\phi(r_0) = 1$ and $\phi(r) < 1$ for $r_0 < r \le 1$. The function
$|f^{(p)}(z)|$ attains the value $M_p(r_0)$ at a point $z_0$ of modulus $r_0$:

$$(28.1) \qquad |f^{(p)}(z_0)| = M_p(r_0) = \frac{1}{(1 - r_0)^p} .$$
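Consider a circle $\gamma$ of center $z_0$ and radius $\rho = (1 - r_0)/2$ and the function

$$g(\zeta) = \frac{f(z_0 + \rho\zeta) - f(z_0)}{\rho^p f^{(p)}(z_0)} = a_1 \zeta + a_2 \zeta^2 + \cdots$$

for suitably chosen constants $a_1, a_2, \dots$. It is regular in $|\zeta| \le 1$ and

$$g^{(p)}(\zeta) = \frac{f^{(p)}(z_0 + \rho\zeta)}{f^{(p)}(z_0)} .$$

Now, in the circle $|\zeta| \le 1$ we have $|z_0 + \rho\zeta| \le r_0 + (1/2)(1 - r_0) = (1/2)(1 + r_0)$
and therefore in $|\zeta| \le 1$

$$|f^{(p)}(z_0 + \rho\zeta)| \le M_p\left( \frac{1 + r_0}{2} \right) < \frac{1}{(1 - (1/2)(1 + r_0))^p} = \frac{2^p}{(1 - r_0)^p} .$$

Hence, in view of (28.1)

$$|g^{(p)}(\zeta)| < 2^p$$

for $|\zeta| \le 1$. Successive integration shows that

$$(28.2) \qquad |g(\zeta)| < 2^p$$

for $|\zeta| \le 1$ and we also have

$$(28.3) \qquad g(0) = 0, \qquad g^{(p)}(0) = 1 .$$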


Now it was shown in Theorem 1, Chapter III, that for the class of functions 
satisfying the conditions (28.2) and (28.3) 


M(2°-p!) _ 


(28.4) D,(0) 2 


Consequently, by the definition of g(¢) it follows that for the function f(z), 
setting 


D,( wo) = p?| | = - 


The condition that $f(z)$ be analytic in the closed circle $|z| \le 1$ may now
be lifted. Indeed, let $f(z)$ be assumed to be analytic in $|z| < 1$. Then, if $r$ is a
value in the interval $0 < r < 1$, the function

$$F(z) = \frac{1}{r^p} f(rz)$$

is analytic for $|z| \le 1$ with $F^{(p)}(0) = 1$. Furthermore, we have

1 

Hence, since the theorem applies to F(z), there exists a Zo (| Zo| <1) so that 
1 
B,s =, 


Now allowing r to approach one, we obtain the theorem in the general case. 
A lower bound for $B_p$ may be obtained from the estimate in (19.2). This
value, however, is certainly not sharp.

As a matter of record, we formulate without proof the


COROLLARY. Let the function $w = f(z)$ be analytic in $|z| < 1$, with

$$|f'(0)| + \frac{1}{2!} |f''(0)| + \cdots + \frac{1}{p!} |f^{(p)}(0)| = m .$$

There exists a positive constant $B_p'$ independent of $m$ and $f(z)$ such that the Riemann
configuration $R_f$ onto which $w = f(z)$ maps the region $|z| < 1$ contains at
least one point $w_0$ for which $D_p(w_0) \ge m B_p'$. In fact, we may choose $B_p'$ as the smallest
of the numbers $j! \cdot B_j / p$, $j = 1, 2, \dots, p$, in the notation of Theorem 5.


29. Unrestricted functions; properties of $A(z)$. From the example of §26
it is clear at once that one cannot obtain a relation between $D_1(w_0)$ and
$|f'(z_0)| (1 - |z_0|^2)$ without some restriction on the class of functions $f(z)$ to be
considered. It is perhaps not without interest to remark that by introducing a
new quantity $A(z)$ one may obtain relations of the desired kind without imposing
any restriction on $f(z)$ other than analyticity in the unit circle $|z| < 1$.
In fact, we prove


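THEOREM 6. Let $w = f(z)$, analytic for $|z| < 1$, map $|z| < 1$ onto a Riemann
configuration $S$. Let $z_0$ be any point of the circle $|z| < 1$ which is mapped by
$w = f(z)$ onto a point $w_0$ of $S$ which is not a branch point of $S$. Denoting by $A(z_0)$
the radius of the largest circle of the $\zeta$-plane with center $\zeta = 0$ in which the function

$$\phi(\zeta) = f\left( \frac{\zeta + z_0}{1 + \bar{z}_0 \zeta} \right)$$

is univalent, the inequality

$$(29.1) \qquad \frac{1}{4} \cdot \frac{D_1(w_0)}{A(z_0)} \le |f'(z_0)| (1 - |z_0|^2) \le 4 \cdot \frac{D_1(w_0)}{A(z_0)}$$

holds. In particular, for a sequence of points $\{z_n\}$ ($|z_n| < 1$) a necessary and
sufficient condition for

$$(29.2) \qquad \lim_{n\to\infty} |f'(z_n)| (1 - |z_n|^2) = 0$$

is

$$(29.3) \qquad \lim_{n\to\infty} \frac{D_1(w_n)}{A(z_n)} = 0,$$

where $w_n = f(z_n)$.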


The function $\zeta = \psi(w)$ inverse to $w = \phi(\zeta)$ is univalent on $S$, in particular
univalent for $|w - w_0| < D_1(w_0)$. Therefore, by Koebe's distortion theorem it
must map the circle $|w - w_0| < D_1(w_0)$ onto some region of the $\zeta$-plane within
which $\phi(\zeta)$ is univalent and which contains in its interior the circle

$$|\zeta| < (1/4) |\psi'(w_0)| \cdot D_1(w_0),$$

whence

$$A(z_0) \ge (1/4) |\psi'(w_0)| \cdot D_1(w_0) .$$

By the relation $1/\psi'(w_0) = \phi'(0) = f'(z_0)(1 - |z_0|^2)$ we obtain the left-hand
side of inequality (29.1).

Similarly, the function $w = \phi(\zeta)$ is univalent for $|\zeta| < A(z_0)$, hence again
by Koebe's distortion theorem maps smoothly $|\zeta| < A(z_0)$ onto a region containing
the circle $|w - w_0| < (1/4) |\phi'(0)| \cdot A(z_0)$. Hence,

$$D_1(w_0) \ge (1/4) |\phi'(0)| \cdot A(z_0),$$

and the right-hand side of inequality (29.1) follows directly.
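
A quick sanity check of (29.1) is available in the simplest case f(z) = z. The sketch below rests on assumptions worked out for this special case only and not stated in the text: the image is the unit circle itself, so the radius of univalence at w_0 is taken to be 1 - |w_0|, and the function φ(ζ) is a linear fractional map, univalent in the whole unit circle, so A(z_0) = 1. The function name is hypothetical.

```python
def check_identity_map(z0):
    # Special case f(z) = z of inequality (29.1); quantities derived by hand for this case.
    d1 = 1 - abs(z0)           # radius of univalence at w0 = z0 (distance to |w| = 1)
    a = 1.0                    # phi is univalent in all of |zeta| < 1
    middle = 1 - abs(z0) ** 2  # |f'(z0)| (1 - |z0|^2) with f'(z0) = 1
    lower = 0.25 * d1 / a
    upper = 4 * d1 / a
    return lower, middle, upper

if __name__ == "__main__":
    for z0 in (0j, 0.5 + 0j, complex(0.6, 0.7), complex(0.0, 0.999)):
        print(z0, check_identity_map(z0))
```

In each printed triple the middle value lies between the outer two, as (29.1) requires; here the ratio of the middle term to D_1(w_0)/A(z_0) is 1 + |z_0|, which stays between 1 and 2.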

Next, the equivalence of the relations (29.2) and (29.3) follows from (29.1)
provided $w_n$ are not branch points of $S$. Indeed, if $w_0$ is a branch point of $S$,
the expression $D_1(w_0)/A(z_0)$ has no meaning since both numerator and denominator
are zero. We observe, however, that if a sequence of points $z_n$ for
which the corresponding points $w_n$ are not branch points of $S$ converge to a
point $z_0$ (with $|z_0| < 1$) for which the corresponding point $w_0$ is a branch point
of $S$, then by the first inequality of (29.1)

$$\lim_{n\to\infty} \frac{D_1(w_n)}{A(z_n)} = 0 .$$

Hence, it is reasonable to define $D_1(w_0)/A(z_0)$ as zero when $w_0$ is a branch point
of $S$. With this convention the equivalence of (29.2) and (29.3), as well as the
inequality (29.1), remain valid even in the case of branch points.

30. Pathology. There are several fairly obvious extensions of our fundamental
Theorem 2 of Chapter IV to the effect that if $f(z)$ is analytic and omits
two values in $|z| < 1$, if $\{z_n\}$ is a sequence of points in $|z| < 1$, and if the numbers
$w_n = f(z_n)$ are bounded, then the two conditions

$$(30.1) \qquad D_1(w_n) \to 0,$$

$$(30.2) \qquad f'(z_n)(1 - |z_n|^2) \to 0,$$

are equivalent in the sense that each implies the other. The mere analyticity
of $f(z)$ insures that (30.2) implies (30.1); so we are concerned at present only
with the condition that (30.1) shall imply (30.2). Thus it is sufficient for (30.1)
to imply (30.2) if we replace the condition that $f(z)$ omits two values in
$|z| < 1$ by the condition that

$$(30.3) \qquad \phi_n(\zeta) = f\left( \frac{\zeta + z_n}{1 + \bar{z}_n \zeta} \right)$$

shall omit two values in $|\zeta| < r < 1$, where $r$ is independent of $n$; no essential
change in the original reasoning is necessary; compare §27, Corollaries 4 and 5.
It is obvious too that (30.1) implies (30.2) provided from each subsequence $z_{n_k}$
of the $z_n$ can be extracted a new subsequence $z_{m_k}$ for which there exists a positive
number $r < 1$ such that the corresponding functions $\phi_{m_k}(\zeta)$ defined by
(30.3) have two omitted values in $|\zeta| < r$; for under such circumstances the
fulfillment of condition (30.1) implies that for no subsequence $z_{n_k}$ does the
expression

$$f'(z_{n_k})(1 - |z_{n_k}|^2)$$

approach a limit different from zero, whence (30.2) is satisfied. For instance,
it may occur that the functions $\phi_{2n}(\zeta)$ have the exceptional values 0 and 1
in $|\zeta| < 1/2$, and that the functions $\phi_{2n+1}(\zeta)$ have the exceptional values 2
and 3 in $|\zeta| < 1/4$.


DEFINITION. Let the function $f(z)$ be analytic for $|z| < 1$, let $\{z_n\}$ be a sequence
of points in $|z| < 1$, let $w_n = f(z_n)$ approach a finite limit, let (30.1) be
satisfied but suppose

$$(30.4) \qquad \lim_{n\to\infty} f'(z_n)(1 - |z_n|^2) \ne 0;$$

then we shall say that $\{z_n\}$ is a q-sequence.
The discussion we have already given yields 


THEOREM 7. Under the hypothesis of the italicized definition, let $\{z_n\}$ be a
q-sequence. Then from no subsequence $\{z_{n_k}\}$ of the $\{z_n\}$ can there be extracted a
new subsequence $\{z_{m_k}\}$ such that the functions $\phi_{m_k}(\zeta)$ defined by (30.3) have two
exceptional values in any region $|\zeta| < r < 1$, where $r$ is independent of $m_k$.


In other words, if $\{z_n\}$ is a q-sequence, then for every $r$, $0 < r < 1$, and for
every infinite sequence of subscripts $\{n_k\}$, the functions $\phi_{n_k}(\zeta)$ have at most
one exceptional value in $|\zeta| < r$.

Some consequences of Theorem 7 are more conveniently described after
transformation of $|z| < 1$ onto a half-plane $R(z') > 0$.


THEOREM 8. Under the hypothesis of the italicized definition, let $\{z_n\}$ be a
q-sequence having as limit the point $z_0$, with $|z_0| = 1$. Let the region $|z| < 1$ be
transformed by a linear transformation onto $R(z') > 0$ so that $z = z_0$ corresponds
to $z' = 0$. Then there exists a half-line $L$ from $z' = 0$ in the closed region $R(z') \ge 0$
possessing the property that if $S$ is a sector (of a circle) containing $L$ in its interior
and with vertex in $z' = 0$, of arbitrarily small radius, then in $S$ the transform of
the function $f(z)$ has at most one exceptional value.


Let the points $z_n'$ (necessarily approaching $z' = 0$) be the transforms in the
$z'$-plane of the points $z_n$. The numbers

$$\theta_n = \arg z_n', \qquad -\pi < \theta_n < \pi,$$

have at least one limit value, say $\theta = \theta_0$; the half-line $L$ may be chosen as $\theta = \theta_0$,
as we shall proceed to prove.

A non-euclidean circle in the $z'$-plane whose non-euclidean center is $z' = a$,
$R(a) > 0$, is transformed by shrinking or stretching the plane with $z' = 0$ fixed
into a non-euclidean circle of the same radius, for the transformation leaves
the region $R(z') > 0$ invariant. Let $S$ be given, and let $S'$ be a sector interior
to $S$ whose sides are also interior to $S$, likewise having $z' = 0$ as vertex, and
containing $L$ in its interior. Then an infinity of points $z_n'$ lie interior to $S'$.
Let $\rho$ denote the smaller of the two non-euclidean radii of the two circles
whose euclidean centers lie on the respective rays bounding $S'$ and which are
tangent to $S$; the circles are not uniquely determined but their non-euclidean
radii are uniquely determined; there is an exceptional situation here, which
presents no inherent difficulty and whose treatment is left to the reader, if
the half-line $\theta = \pi/2$ or $\theta = -\pi/2$ lies in or on the boundary of $S$. The non-euclidean
circles whose common non-euclidean radius is $\rho$ and whose euclidean
centers are the infinity of points $z_n'$ interior to $S'$ are circles all of whose interior
points are interior points of $S$. Theorem 8 now follows from Theorem 7.

It is obviously true that in $S$ the function $f(z)$ takes on every value with
at most one exception an infinite number of times.

Theorem 8 obviously bears a close analogy to Julia's theorems on entire
functions(*). The analogy can be pursued still more closely as we now indicate.

In the $z'$-plane used in Theorem 8 let $C$ be an arbitrary curve (not necessarily
a Jordan curve) joining the unit circle to the origin,

$$C:\ z' = \sigma(t), \quad 0 \le t \le 1, \qquad \sigma(0) = 0, \quad |\sigma(1)| = 1,$$

where $\sigma(t)$ is a continuous complex-valued function of the real parameter $t$.
From $C$ is found by rotation about the origin a curve which we denote by
$C(\omega)$: $z' = \omega \cdot \sigma(t)$, $|\omega| = 1$. We shall call a horn the set $H(\omega, \epsilon)$ of points each
of which lies interior to at least one of the circles having its center in a point $z'$
on $C(\omega)$ and of radius $\epsilon \cdot |z'|$. It will be noted that the horn $H(\omega, \epsilon)$ is then a
region, and that each of its boundary points except $z' = 0$ is on the circumference
of a circle of center $z'$ on $C(\omega)$ and radius $\epsilon |z'|$. But of course the curve
$C$ and the horn $H(\omega, \epsilon)$ need not lie entirely in the closed region $R(z') \ge 0$.
We now prove a generalization of Theorem 8.


THEOREM 9. Under the hypothesis of Theorem 8 for arbitrary $C$ there exists
a curve $C(\omega_0)$ such that in every horn $H(\omega_0, \epsilon)$ the transform of the function $f(z)$
takes on every value an infinite number of times, with the exception of at most
one value.

(*) G. Julia, Leçons sur les Fonctions Uniformes à Point Singulier Essentiel Isolé, Paris,
1924, p. 105 ff.


Let the numbers $\epsilon$ and $\epsilon_1$ be given, $1 > \epsilon > \epsilon_1 > 0$. Consider all circles $\gamma$
and $\gamma_1$ of radii $r\epsilon$ and $r\epsilon_1$ with variable common center $(r, \theta)$, where $r$ is
bounded and $\theta$ is arbitrary. Then the non-euclidean distance from a point of $\gamma_1$
in the region $R(z') > 0$ to the nearest point of $\gamma$ is bounded from zero, say is
greater than or equal to some positive $\delta$ independent of $r$ and $\theta$. This conclusion
follows from the fact that in studying the non-euclidean distance it is
no loss of generality to take $r = 1$.

As an application of this remark, since each boundary point of the horn
$H(\omega, \epsilon_1)$ lies on a circle $\gamma_1$ with center $z'$ on $C(\omega)$ and radius $\epsilon_1 |z'|$, and since
all points interior to the circle $\gamma$ with center $z'$ and radius $\epsilon |z'|$ belong to
$H(\omega, \epsilon)$, it follows that the non-euclidean distance from each boundary point
of $H(\omega, \epsilon_1)$ in $R(z') > 0$ to the boundary of $H(\omega, \epsilon)$ is greater than or equal
to $\delta$. If all points of a set $\{z_{n_k}'\}$ in $R(z') > 0$ lie in $H(\omega, \epsilon_1)$, then each point whose
non-euclidean distance from some $z_{n_k}'$ is less than $\delta$ lies in $H(\omega, \epsilon)$.

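Suppose now the points

$$z_n' = r_n e^{i\theta_n}$$

are the transforms in the $z'$-plane of the given q-sequence. Each $z_n'$ lies on
some curve $C(\omega_n)$; in fact, the continuous function $|\sigma(t)|$ must take on the
value $r_n$ for some value of $t$, say $t_n$, $0 < t_n < 1$, whence

$$z_n' = |\sigma(t_n)| e^{i\theta_n}, \qquad r_n = |\sigma(t_n)|,$$

so $z_n'$ lies on the curve

$$C(\omega_n):\ z' = \omega_n \sigma(t), \qquad \omega_n = \frac{|\sigma(t_n)|}{\sigma(t_n)} e^{i\theta_n} .$$

Of course $t_n$ and $\omega_n$ need not be uniquely defined, but we choose a specific
determination.

Let the set $\omega_1, \omega_2, \dots$ on the unit circle have the limit point $\omega_0$. Then for
every $\epsilon_1 > 0$, the horn $H(\omega_0, \epsilon_1)$ has an infinity of the points $z_n'$ in its interior.
For on the arc of $|z'| = 1$ in the circle $|z' - \omega_0| = \epsilon_1$ lie an infinity of points $\omega_n$,
say $\omega_{n_1}, \omega_{n_2}, \dots$. Then of the circle $r = r_{n_k}$, the entire arc which lies in the circle

$$|z' - \omega_0 \sigma(t_{n_k})| = \epsilon_1 |\sigma(t_{n_k})|$$

lies in $H(\omega_0, \epsilon_1)$, and this arc of the circle $r = r_{n_k}$ contains the point

$$z_{n_k}' = |\sigma(t_{n_k})| e^{i\theta_{n_k}}$$

by virtue of the inequality for $\omega_{n_k}$,

$$\left| \frac{|\sigma(t_{n_k})|}{\sigma(t_{n_k})} e^{i\theta_{n_k}} - \omega_0 \right| < \epsilon_1 .$$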

We are now in a position to prove Theorem 9. Let C be given. The num- 
ber wo is to be determined as just indicated, and thus C(wo) is defined. Then 
e>0 is arbitrary, and we choose 4, 0<e:<e. The points 2, already defined 
are the transforms of a q-sequence 2,4; it follows from Theorem 7 that in the 
set of circles having the z;, as non-euclidean centers and with a common non- 
euclidean radius the function f,(z’) [transform of f(z) ] takes on every value 
with at most one exception an infinite number of times. The points 2,, lie in 
H(wo, €:), and these non-euclidean circles (chosen with common non-euclidean 
radius less than the number 6 previously defined) all lie in H(wo, €). The proof 
is complete. 

It is also true that in every $H(w_0, \epsilon)$ in every neighborhood of the origin the
function $f_1(z')$ takes on every value with at most one exception an infinite
number of times.

31. Functions with bounded $D_1$. In studying the relation between $D_1(w)$
and $|f'(z)|(1-|z|^2)$ we have restricted the class of functions $f(z)$ in such a
manner that the associated functions ¢,({) should form a normal family. For 
this reason we considered the class of univalent functions, the class of bounded 
functions, and the class of functions omitting two values. There is, however, 
another criterion of normality, which was discovered by Bloch(®). It is the 
class of functions for which the radius of univalence $D_1(w)$ is bounded. The
desired relations may be easily obtained for this class. Indeed, we have 


THEOREM 10. Let $w=f(z)$ be analytic for $|z|<1$, and let $D_1(w)$ be uniformly
bounded: $D_1(w)\le D$. Setting $w_0=f(z_0)$, where $z_0$ is an arbitrary point of $|z|<1$,
the inequality


holds, where K may be taken equal to 20D/B, B being Bloch’s constant. 


We begin by using the method of proof (Montel, ibid.) of Bloch's theorem
on normal families. If we set


— wo| S «1. 


1 — | 2:| 

| | < 1, | 2: | <1, 0, 
we note that $g(\zeta)$ is analytic in $|\zeta|<1$, with $g'(0)=1$. Then it follows from
Bloch's theorem (§28, Theorem 5 for p=1) that for the function $g(\zeta)$ and for
some w we have $D_1(w)\ge B$, where B is Bloch's constant; hence if $D_1(w)$ refers
now to the function $\phi[z_1+(1-|z_1|)\zeta]$ we have for some w


(*) Cf. P. Montel, ibid., p. 115. 


1 + 22 




$D_1(w)\ge B\,|\phi'(z_1)|\,(1-|z_1|)$.

But by our hypothesis we have $D_1(w)\le D$, whence

$|\phi'(z_1)|\le\frac{D}{B\,(1-|z_1|)}$;

this inequality is valid in the case $\phi'(z_1)=0$, exceptional for (31.2).
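If we introduce the notation

(31.3)    $\Phi(\zeta)=\phi(\zeta)-\phi(0)=\int_0^{\zeta}\phi'(\zeta)\,d\zeta$,

where the integral is taken along a line segment, we have for $|\zeta|\le\rho<1$

$|\Phi(\zeta)|\le\int_0^{\rho}\frac{D\,d\rho}{B(1-\rho)}=-\frac{D}{B}\log(1-\rho)$.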


The inequality of Theorem 2, §10, can be written in the present case

$D_1(w_0)\ge\frac{|\Phi'(0)|^2\rho^2}{-(4D/B)\log(1-\rho)}$,    $w_0=f(z_0)$.

It is seen immediately that the maximum of the function

$\frac{\rho^2}{-\log(1-\rho)}$,    $0<\rho<1$,

occurs when

$-\log(1-\rho)=\frac{\rho}{2(1-\rho)}$,

which is approximately $\rho=.72$, so we may take

$D_1(w_0)\ge\frac{B}{10D}\,|f'(z_0)|^2(1-|z_0|^2)^2$.


This proves the theorem. 
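The numerical claim used in the proof, that the maximum of $\rho^2/(-\log(1-\rho))$ occurs near $\rho=.72$, can be checked with a short computation. The following is only a sketch: it solves the stationarity condition $-\log(1-\rho)=\rho/(2(1-\rho))$ by bisection, and the function names and the bracketing interval [0.5, 0.9] are choices made here, not part of the paper.

```python
import math


def objective(rho):
    """rho^2 / (-log(1 - rho)), the quantity maximized in the proof of Theorem 10."""
    return rho * rho / (-math.log(1.0 - rho))


def stationarity_gap(rho):
    """-log(1 - rho) - rho / (2 (1 - rho)); it vanishes at the maximizing rho."""
    return -math.log(1.0 - rho) - rho / (2.0 * (1.0 - rho))


def maximizing_rho(lo=0.5, hi=0.9, iterations=60):
    """Bisection on stationarity_gap; the bracket [0.5, 0.9] is an assumption of this sketch."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if stationarity_gap(mid) > 0.0:
            lo = mid  # the gap is positive to the left of the root
        else:
            hi = mid
    return 0.5 * (lo + hi)


rho_star = maximizing_rho()
print(rho_star, objective(rho_star))
```

The maximum value found this way is a little above 0.4, so dividing by 4 gives a factor slightly larger than 1/10, which agrees with the rounded constant $10D$ in the final inequality of the proof.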

As a corollary, it is seen that under the hypothesis of Theorem 10 the condition
$D_1(w_n)\to 0$ is a sufficient condition for $|f'(z_n)|(1-|z_n|^2)\to 0$ even when
$w_n\to\infty$. As a further remark it may be observed that the class of functions
considered in Theorem 10 includes the case that the area of the image of
$|z|<1$ under the transformation $w=f(z)$ is finite.

It is clear that analogous inequalities could be obtained for the higher derivatives.
We proceed instead to the analogous theorem for $D_p(w)$ in general:




THEOREM 11. Let $w=f(z)$ be analytic for $|z|<1$, and let $D_p(w)$ be uniformly
bounded: $D_p(w)\le D_p$, where p is given and $D_p$ is independent of w. If we set
$w_0=f(z_0)$, where $|z_0|<1$, we have


kel | (k — 


D; 1-2” 
249K, ,(—*) [D,(w0)]*”, 


where $B_p$ is the constant of the corollary of §28, and where $K_p$ is a constant
depending only on p; indeed we may set


K, = min {p-»[— log (1 — p)}-*”,0< p< 1}, 
or we may set K,=2?. . 


Of course the boundedness of $D_p(w)$, as in Theorem 11, is a stronger condition
than the boundedness of $D_1(w)$, as in Theorem 10, for we have
$D_p(w)\ge D_1(w)$.

As before, we introduce ¢(z) by the first of equations (31.2), but set now 
G(g) | z1| where z; is arbitrary provided | <1. Thus is 
analytic in lg] <1. Then if D,(wo) refers to G(¢) or to f(z), we have by the 
corollary, §28 


1 1 
D,(w0) = By | #0) + | 
By virtue of the inequality $D_p(w_0)\le D_p$, we may now write


> (1 | | 
Bs — | 21] ) | |. 


In the notation of (31.3) we have for $|\zeta|\le\rho<1$


D, 


The function $\Phi(\rho\zeta)$ is analytic in $|\zeta|<1$ and has there the bound indicated
by (31.4). By §19, Corollary 3, we may write


p* p? 


D, 
s 249| - - [D,(0) }*”, 


and this inequality is valid whether $D_p(0)$ refers to $\Phi(\rho\zeta)$ in $|\zeta|<1$, to $\Phi(\zeta)$
in $|\zeta|<\rho$, or to $\Phi(\zeta)$ in $|\zeta|<1$. The first part of Theorem 11 follows at once,
where $D_p(w)$ refers now to $f(z)$, by §2, Lemma 2. The latter part of Theorem
11 follows from the inequality for $\rho=1/2$


p-?[— log (1 — p)]**” < 22. 


An obvious consequence of Theorem 11 is that under the conditions of
that theorem $D_p(w_n)\to 0$ implies

$f^{(k)}(z_n)(1-|z_n|^2)^k\to 0$,    $k=1, 2, \dots, p$,

where $w_n=f(z_n)$, $|z_n|<1$; this conclusion is valid even if

32. Comments on condition $|z_n|\to 1$. In the major part of the present
paper, so far as it deals with $D_1(w)$, we are concerned with a function $f(z)$
analytic for $|z|<1$ and the two conditions

(32.1)    $D_1(w_n)\to 0$,    $w_n=f(z_n)$,
(32.2)    $|f'(z_n)|(1-|z_n|^2)\to 0$.

In the present section we propose to study the further condition

(32.3)    $|z_n|\to 1$

in its relation to (32.1) and (32.2). To some extent, our remarks will be a
recapitulation of material already developed.

The relation (32.2) implies (32.1) with no further restriction on $f(z)$, as
follows from §4, Theorem 2.

For a univalent function $f(z)$, relation (32.1) implies (32.2) by §4, Theorem 1'.
For such a function each of the conditions (32.1) and (32.2) implies
(32.3), because $|f'(z)|$ has a positive lower bound in the closed region $|z|\le r<1$;
but (32.3) does not imply (32.1) or (32.2), as is illustrated by the function
$f(z)=z/(1-z)^2$, when real $z\to 1$; nevertheless (32.3) combined with the
boundedness of $w_n$ implies (32.1) and (32.2), as follows by the kind of reasoning
about to be given.
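The counterexample can be made explicit. The following worked computation is a sketch added for illustration; it uses only the formula $f(z)=z/(1-z)^2$ from the text and the quantity $|f'(z)|(1-|z|^2)$ of condition (32.2).

```latex
f(z)=\frac{z}{(1-z)^{2}},\qquad
f'(z)=\frac{1+z}{(1-z)^{3}},\qquad
|f'(x)|\,(1-x^{2})=\frac{(1+x)^{2}}{(1-x)^{2}}\longrightarrow\infty
\quad (x\ \text{real},\ x\to 1^{-}).
```

Thus along real $z_n\to 1$ the condition $|z_n|\to 1$ holds while $|f'(z_n)|(1-|z_n|^2)$ does not tend to zero.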

However, if $f(z)$ is both univalent and bounded, each of the conditions
(32.1), (32.2), (32.3) implies all those conditions; it is sufficient now to show
that (32.3) implies (32.1). The plane region R which is the image of $|z|<1$
under the map $w=f(z)$ can be considered the sum of the plane regions $R_\nu$,
the respective images of $|z|<1-1/\nu$, $\nu=1, 2, \dots$, under the map $w=f(z)$.
The regions $R_\nu$ increase monotonically; given an arbitrary $\delta>0$, there exists
an index $N_\delta$ such that every point of $R-R_{N_\delta}$ lies within a distance less than $\delta$ of
the boundary of R; the inequality $|z|>1-1/N_\delta$ implies $D_1(w)<\delta$; thus (32.3)
implies (32.1) and hence (32.2).

Let now $f(z)$ be bounded in $|z|<1$; we have already indicated (§10) that
the conditions (32.1) and (32.2) are equivalent. Nevertheless it is obvious
that (32.1) does not imply (32.3); whenever $z_n$ approaches a point $z_0$ with
$|z_0|<1$, $f'(z_0)=0$, the relation (32.1) is satisfied without (32.3): nevertheless,
if $D_1(w_n)\to 0$, there exists a subsequence of the $z_n$ which approaches a point $z_0$,
with either $f'(z_0)=0$, $|z_0|<1$, or $|z_0|=1$. Reciprocally, Szegö's example
(introduction to Chapter II) shows that (32.3) may be satisfied without (32.1).

Let us suppose now $f(z)$ bounded in $|z|<1$, $|f(z)|<M$, $|z_n|\to 1$, $w_n=f(z_n)\to w_0$,
$D_1(w_n)\ge\delta>0$; we shall derive some geometric properties of the Riemann
configuration R onto which the transformation $w=f(z)$ maps $|z|<1$. By
inequality (4.4) we may write also

(32.4)    $|f'(z_n)|(1-|z_n|^2)\ge\delta$.
Let r be arbitrary, $0<r<1$. The function

$\phi(\zeta)=f\left(\frac{\zeta+z_n}{1+\bar z_n\zeta}\right)$

is analytic in $|\zeta|<r$, has a modulus there not greater than M, with
$|\phi'(0)|=|f'(z_n)|(1-|z_n|^2)\ge\delta$. It follows from the Landau-Dieudonné theorem
(§10) that the image of $|\zeta|<r$ under the transformation $w=f(z)$ contains
a smooth circle whose center is $w_n$ and whose radius is at least $r^2\delta^2/8M=\delta_1$.
By virtue of the relation $|z_n|\to 1$, it is possible to choose a subsequence $z_{n_k}$
having the property that the circle whose non-euclidean center is $z_{n_k}$ and
non-euclidean radius $2\log[(1+r)/(1-r)]$ contains on or within it none of the
points $z_{n_{k+j}}$, $j>0$; as a consequence it follows from the triangle inequality that
the circles $\gamma_{n_k}$ whose non-euclidean centers are the points $z_{n_k}$, having the common
non-euclidean radius $\log[(1+r)/(1-r)]$, are mutually exterior; this circle
$\gamma_{n_k}$ is the image of $|\zeta|=r$ under the transformation $z=(\zeta+z_{n_k})/(1+\bar z_{n_k}\zeta)$.
Then the closed interiors of the smooth circles $C_{n_k}$ on R whose centers are the
respective points $w_{n_k}$, having the common radius $\delta_1$, are mutually disjoint. By virtue
of our assumption $w_n\to w_0$, it appears that the configuration R has an infinity
of separate sheets over the point $w=w_0$, each sheet containing a circle of
center $w_0$ and radius $\delta_1-\eta$, where $\eta$ is arbitrary. We shall prove


THEOREM 12. Let the function $f(z)$ analytic and bounded in $|z|<1$ admit a
sequence $z_n$ with $|z_n|<1$, $|z_n|\to 1$,

(32.5)    $D_1(w_n)\ge\delta>0$,    $w_n=f(z_n)$.

Then there exists a value $w=w_0$ such that the Riemann configuration R onto which
the transformation $w=f(z)$ maps $|z|<1$ has an infinity of separate sheets over
the point $w=w_0$, each sheet containing a smooth circle whose center lies over the
point $w=w_0$ and whose radius is $\delta_1>0$, where $\delta_1$ is suitably chosen.


In Theorem 12, the condition (32.5) may of course be replaced by the condition
that $|f'(z_n)|(1-|z_n|^2)$ should be bounded from zero, a condition that
implies (32.5).

To prove Theorem 12 it suffices to apply the reasoning already given to a
subsequence of the $w_n$ possessing a limit. Of course it is not possible to assert
here that the original sequence of circles of radii $D_1(w_n)$ corresponds to separate
sheets of R; if the circles of radii $D_1(w_{2n})$ are given arbitrarily, corresponding
to separate sheets of R, the point $z_{2n+1}$ can be chosen so near $z_{2n}$ that the
corresponding circles overlap, while an inequality of form (32.5) persists.

Conversely, let $f(z)$ now be analytic and bounded for $|z|<1$, and let
$w=f(z)$ map $|z|<1$ onto a Riemann configuration which has an infinity of
separate sheets over some point $w_0$, each sheet containing a smooth circle $\gamma_n$
whose center lies over the point $w=w_0$ and whose radius is $\delta_1>0$; it is obvious
that the centers of these smooth circles can be chosen as points $w_n$ so that the
relation $D_1(w_n)\ge\delta_1$ is fulfilled. The relation $|z_n|\to 1$ follows because otherwise
a subsequence $z_{n_k}$ has a limit point $z_0$, with $|z_0|<1$; we have $w_0=f(z_{n_k})$, hence
$w_0=f(z_0)$; an infinity of points $z_{n_k}$ lie in an arbitrary neighborhood of $z_0$; an
infinity of the points $w_{n_k}=f(z_{n_k})$ on R lie on R in each $C_p$ whose center is
$w_0=f(z_0)$, where $p-1$ is the order of $z_0$ as a zero of $f'(z)$; this is in contradiction
to our hypothesis that the $\gamma_n$ lie in distinct sheets of R; the converse of
Theorem 12 is established.

In Theorem 12 and its converse, we have supposed $f(z)$ to be bounded; it
is also sufficient if $f(z)$ has two exceptional values in $|z|<1$; compare §22.

We add one further remark, in a somewhat different order of ideas. Let
$w=f(z)$ be analytic in $|z|<1$, and let us suppose

(32.6)    $\limsup_{|z|\to 1} D_1(w)<\infty$;

this condition is a consequence of

(32.7)    $\limsup_{|z|\to 1} |f'(z)|(1-|z|^2)<\infty$,

if (32.7) itself is valid. It follows from (32.6) that $D_1(w)$ is uniformly bounded
in $|z|<1$. Hence (32.1) and (32.2) are equivalent. Moreover, the discussion
of Theorem 12 and its converse applies here. But even under these circumstances
it is not true that (32.3) implies (32.1) or (32.2); this is shown by the
function $w=f(z)$ with $f(0)=0$, $f'(0)>0$, which maps $|z|<1$ onto the strip
$|v|<\pi$, where $w=u+iv$; we have $D_1(w)\le\pi$. But when $z_n$ is positive, $z_n\to 1$,
we have $D_1(w_n)=\pi$, so neither (32.1) nor (32.2) is satisfied.
33. p-valent functions. For p-valent functions we can obtain results analo- 
gous to those for bounded functions and for functions which have two excep- 
tional values. 


THEOREM 13. Let the function f(z) be analytic and p-valent in the region 
|z| <1. Then we have 


1 
(33.1) | 7 + < A,-D,(0), 


where A, is a numerical constant depending only on p. 




We assume f(0) =0, which obviously involves no loss of generality. We 
write for reference the inequality 


uy = max [| | | S| +| +--- +] 
1 
(33.2) 


A theorem due to M. L. Cartwright(™) asserts that under the conditions 
of Theorem 1, since we have f(0) =0, we have 


(33.3) | f(z) | As — ls) sr<1, 


where A, is a number depending only on p and where y, is defined by (33.2). 
We shall use (33.3) for the particular value r=1/2: 


(33.4) | f(z) | | < 1/2. 


The function F(z) =#f(z/2) is analytic in the region | s| <1 and has there 
the bound 277-A, -y,. If D,(0) refers to F(z) or to f(z) we have by §19, 
Corollary 3, 


1 1 
[D,(0) 
where B, may be chosen as 24p. The first member of (33.5) can be written 
1 
22-2! 


2 2°. p! 


which is not greater than 
1 “2 1 
A consequence of (33.5) and (33.2) is then the inequality 
1 1 ne 


S 2°. B,[2*”-A, [D,(0)]*”, 
which can be put into the form (33.1). 


By virtue of §2, Lemma 2 and §20, Theorem 2, we can formulate from 
Theorem 1 


THEOREM 14. Let the function $f(z)$ be analytic and p-valent in the region
$|z|<1$. Then with the conditions $|z_0|<1$, $w_0=f(z_0)$, we have


(7°) Mathematische Annalen, vol. 111 (1935), pp. 98-118. 




where 7, is the number of §20, Theorem 2, and ©, is a number depending only 

on p which may be chosen as A, in Theorem 1. 
Consequently if we have $|z_n|<1$, $w_n=f(z_n)$, a necessary and sufficient condition
for

$\lim_{n\to\infty} f^{(k)}(z_n)(1-|z_n|^2)^k=0$,    $k=1, 2, \dots, p$,

is the condition $\lim_{n\to\infty} D_p(w_n)=0$.


The case p =1 brings us back to §4, Theorem 3. 

34. Some extensions to meromorphic functions. Let us consider a class of
functions $f(z)$ meromorphic in $|z|<1$, omitting there the three distinct values
a, b, c, and such that $f(0)=A$, $|A|\le A_0$, where $A_0$ is a positive constant
independent of the particular function of the class. Corresponding to this class
there exists a number $\theta$ ($0<\theta<1$) such that we have

(34.1)    $|f(z)|\le Q(A_0, \theta)$

for $|z|<\theta$, where Q is independent of any particular function of the class.

Indeed, suppose no such value of $\theta$ existed. On the circle $|z|=1/n$ some
function $f_n(z)$ would attain a value of modulus exceeding n. From the sequence
of functions $f_n(z)$ one can extract a subsequence converging uniformly(”)
in every closed subregion of $|z|<1$ either to a meromorphic
function or to the infinite constant. The second alternative cannot take
place since by hypothesis $|f_n(0)|\le A_0$ for all n. But, on the other hand, if
the sequence $f_n(z)$ converges to a meromorphic function, the latter must have
a pole at the origin which is not possible on account of the condition
$|f_n(0)|\le A_0$. Hence, the asserted existence of $\theta$ has been established.

Let $z_0$ ($|z_0|<1$) be a point such that, setting $w_0=f(z_0)$, we have $|w_0|\le A_0$.

Consider the function

$\phi(\zeta)=f\left(\frac{\zeta+z_0}{1+\bar z_0\zeta}\right)$

which is meromorphic in $|\zeta|<1$, omits there the values a, b, c, and for which
$\phi(0)=w_0$. In accordance with (34.1) we have

$|\phi(\zeta)|\le Q(A_0, \theta)$

in $|\zeta|<\theta$. Hence, in $|\zeta|<1$ we have

$|\phi(\theta\zeta)|\le Q(A_0, \theta)$.


(”) Defined, for instance, as by Montel, Leçons sur les Familles Normales de Fonctions
Analytiques, Paris, 1927, p. 124.




Now, applying Theorem 5, Chapter III, we obtain the inequality 
(34.2) + | Ay [Dp(wo)]*”, 


where $A_p$ depends on p, $\theta$, and $A_0$. It is clear, furthermore, that $\theta$ depends
on a, b, c, $A_0$ but not on $\phi(\zeta)$ and consequently may be omitted by modifying
$A_p$ properly. It is also to be noted that $D_p(w_0)$ in (34.2) is the radius of p-valence
at the point $w_0$ of the Riemann surface on which the function $\phi(\theta\zeta)$
maps $|\zeta|<1$ which is the same as the radius of p-valence at the point $w_0$ of
the Riemann surface on which $\phi(\zeta)$ maps $|\zeta|<\theta$. This radius of p-valence is
not greater than the radius of p-valence at the point $w_0$ of the Riemann surface
on which $\phi(\zeta)$ maps $|\zeta|<1$. Hence, if in (34.2) we return to the function
$f(z)$ we obtain the inequality
k-1 2) k—v f( k—v) 

kent | (k — v)! 
where $D_p(w_0)$ is now the radius of p-valence at the point $w_0$ of the Riemann
surface on which $f(z)$ maps the circle $|z|<1$.

Now, as is remarked in §21 after the proof of Theorem 2, that theorem
requires analyticity only in the neighborhood of the origin, which $f(z)$ possesses
in $|z|<\theta$. Hence, applying Theorem 2 we find that
» (1 — | (a0) 


Pp. | 
(34.4) (we) S (— 1) — 


| (k— »)! 
where $\lambda_p$ depends on p alone. Thus, we may state


THEOREM 15. Let $f(z)$ be a function meromorphic in $|z|<1$, omitting there
the three distinct values a, b, c. Let $z_0$ ($|z_0|<1$) be a point such that, setting
$w_0=f(z_0)$, $|w_0|\le A_0$ where $A_0$ is a positive constant. Then, the inequalities (34.3)
and (34.4) hold, where $A_p$ depends on $A_0$, a, b, c, but not on $z_0$ or $f(z)$.


It follows that if under the hypotheses of Theorem 15 for a sequence of
points $z_n$ ($|z_n|<1$) the sequence $w_n=f(z_n)$ is bounded, then a necessary and
sufficient condition for $\lim_{n\to\infty} f^{(k)}(z_n)(1-|z_n|^2)^k=0$ ($k=1, 2, \dots, p$) is
$\lim_{n\to\infty} D_p(w_n)=0$.

It will be noted that under the conditions of Theorem 15 we have
$D_p(w)\le|w-a|$ so that inequality (34.3) gives an inequality on the approach
to zero of $(1-|z|^2)^k|f^{(k)}(z)|$ as w tends to a, for every k.

We add the remark that much of the discussion of §27 can be carried 
over to meromorphic functions which omit three values; this development is 
left to the reader. 




UNIVERSITY OF ROCHESTER,
ROCHESTER, N. Y.

HARVARD UNIVERSITY,
CAMBRIDGE, MASS.




ANALYTIC EXTENSION BY HAUSDORFF METHODS 


BY 
RALPH PALMER AGNEW 


1. Introduction. Let $\chi(t)$ be a complex-valued function having bounded
variation over $0\le t\le 1$, let $\chi(0)=0$, and let $\chi(t)$ have no removable
discontinuities. This mass function $\chi(t)$ generates a matrix transformation
(Hausdorff [9]) $H(\chi)$

$\sigma_n=\int_0^1\sum_{k=0}^{n} C_{n,k}\,t^k(1-t)^{n-k}s_k\,d\chi(t)$

by means of which a series with partial sums $s_0, s_1, s_2, \dots$
is summable $H(\chi)$ to $\sigma$ if $\sigma_n\to\sigma$ as $n\to\infty$. The transformation $H(\chi)$ is regular
(that is, such that existence of $\lim s_n$ implies $\lim\sigma_n=\lim s_n$) if and only if
$\chi(1)=1$ and $\chi(t)\to 0$ as $t\to 0$.

Among the familiar regular transformations obtained by specializing $\chi(t)$
are the methods $C_r$ ($\Re(r)>0$) of Cesàro for which $\chi(t)=1-(1-t)^r$; the methods
$H_r$ ($\Re(r)>0$) of Hölder for which

$\chi(t)=\frac{1}{\Gamma(r)}\int_0^t\left(\log\frac{1}{u}\right)^{r-1}du$;

and the methods $E_r$ ($0<r\le 1$) of Euler for which $\chi(t)=0$ or 1 according as
$0\le t<r$ or $r\le t\le 1$.

It is well known that neither Cesàro nor Hölder methods are effective in
evaluating power series outside their circles of convergence. Characterization
of the region of the complex plane in which a power series $\sum a_n z^n$ is summable
$E_r$ was given by Knopp [14] and Rademacher [18] for the cases in which r
is of the form $2^{-k}$, $k=1, 2, 3, \dots$. If $z_0$ is an interior point of the Borel
polygon determined by $\sum a_n z^n$, then $\sum a_n z_0^n$ is summable $E_r$ provided r lies
within a sufficiently small interval $0<r<\delta$ to the right of the origin. On the
other hand, if $z_1$ is a point outside the Borel polygon, then no regular Euler
method can evaluate $\sum a_n z_1^n$. Beyond these facts and a few corollaries of them,
very little seems to be known about the problem of analytic extension by
means of Hausdorff methods of summability. It has been conjectured by
Garabedian and Wall [8] that a regular Hausdorff method is ineffective outside
the circle of convergence unless $\chi(t)$ is constant over some interval
$1-\delta<t\le 1$. For references to literature (up to 1927) on analytic extension
by various methods, see Hille [10].


Presented to the Society, September 5, 1941; received by the editors July 14, 1941. 


218 R. P. AGNEW , [September 


Long ago it became the fashion to test the efficacy of methods of summability
by applying them to the geometric series $\sum z^n$. We determine, in terms
of the generating function $\chi(t)$, the open region of the z-plane in which $\sum z^n$
is summable by a regular Hausdorff method $H(\chi)$. The result is set forth in
the following theorem.


THEOREM 1.1. Let $\chi(t)$ have bounded variation over $0\le t\le 1$, let $\chi(0+)=\chi(0)=0$,
let $\chi(1)=1$, and let $\chi(t)$ have no removable discontinuities. Let r be
the greatest lower bound of all numbers $\rho$ such that $\chi(t)=1$ when $\rho\le t\le 1$. Then
$\sum z^n$ is summable $H(\chi)$ to $1/(1-z)$ at each point z interior to the circle

$|z-(1-r^{-1})|=r^{-1}$

and $\sum z^n$ is non-summable $H(\chi)$ at each point z exterior to the same circle.


Let the number r described in the statement of the theorem be termed the
order of the transformation $H(\chi)$, and let the circle be termed the circle of
summability. If (as is the case for the Cesàro and Hölder methods) there is
no interval $\rho<t\le 1$ over which $\chi(t)$ is constant, then the order r of $H(\chi)$ is 1
and the circle of summability is identical with the circle of convergence. If
(as is the case for the Euler methods of order $0<r<1$) there is an interval
$\rho<t\le 1$ over which $\chi(t)$ is constant, then the order r of $H(\chi)$ is less than 1.
The order must be positive since $\chi(t)\to 0$ as $t\to 0$. Thus the radius $r^{-1}$
of the circle of summability is greater than 1, the circle of summability being
tangent to the line $\Re(z)=1$ at the point $z=1$. Since the center and radius of
the circle of summability depend only upon the order of $H(\chi)$, it follows that
two Hausdorff methods of equal order have the same circle of summability. In
particular, the circle of summability of a Hausdorff method of order r is the
same as that of the Euler method $E_r$ of equal order(1). Differences in effectiveness
of two Hausdorff methods of equal order can appear only for points z
on the circle of summability. It is a consequence of Theorem 1.1 that the
effectiveness of methods $H(\chi)$ in evaluating $\sum z^n$ increases steadily as the order r
decreases. Each fixed point $z_0$ for which $\Re z_0<1$ lies within the circle of
summability of $H(\chi)$ provided the order r of $H(\chi)$ is sufficiently near 0. Each fixed
point $z_1$ for which $z_1\ne 1$ and $\Re z_1\ge 1$ lies outside each circle of summability
and accordingly $\sum z_1^n$ is non-summable $H(\chi)$ for each regular $H(\chi)$.
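The behavior described in Theorem 1.1 can be observed numerically in the simplest case, the Euler method $E_r$. The sketch below is an illustration only: it assumes the standard form of the Euler transform, $\sigma_n=\sum_k C_{n,k}r^k(1-r)^{n-k}s_k$, uses exact rational arithmetic (so z is restricted to rational real values other than 1), and the function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def euler_transform_of_geometric(n, z, r):
    """n-th term of the E_r transform of the partial sums of sum z^k.

    E_r is taken as the Hausdorff method whose mass function jumps from 0 to 1
    at t = r, so sigma_n = sum_k C(n, k) r^k (1 - r)^(n - k) s_k.
    """
    z, r = Fraction(z), Fraction(r)
    sigma = Fraction(0)
    for k in range(n + 1):
        s_k = (1 - z ** (k + 1)) / (1 - z)  # k-th partial sum of the geometric series
        sigma += comb(n, k) * r ** k * (1 - r) ** (n - k) * s_k
    return sigma


def inside_circle_of_summability(z, r):
    """True when |z - (1 - 1/r)| < 1/r, for rational real z."""
    z, r = Fraction(z), Fraction(r)
    return abs(z - (1 - 1 / r)) < 1 / r


r = Fraction(1, 2)
for z in (Fraction(-2), Fraction(-4)):
    print(z, inside_circle_of_summability(z, r),
          float(euler_transform_of_geometric(40, z, r)))
```

With $r=1/2$ the point $z=-2$ lies inside the circle of center $-1$ and radius 2, and the transform approaches $1/(1-z)=1/3$; the point $z=-4$ lies outside, and the transform grows without bound as n increases.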

Proof of summability of $\sum z^n$ when z lies inside the circle of summability
is very simple and is given in §2. Proof of non-summability is more compli- 
cated; this and related facts are proved in §3. In §§4 and 5, we establish 
uniform summability of power series over appropriate sets. In §§6 and 7, we 


(1) A regular method $H(\chi)$ includes $E_r$ if and only if the order of $H(\chi)$ is less than or equal
to r. See Hille and Tamarkin [11]. For a general discussion of inclusion relations involving
regular Hausdorff methods, see Garabedian, Hille, and Wall [7].


1942] ANALYTIC EXTENSION 219 


define and discuss collective Hausdorff summability. In §8, we improve a known 
theorem relating Abel and Hausdorff summability. 
2. Proof of summability. When $z\ne 1$, the series $\sum z^n$ has partial sums

$s_k=(1-z^{k+1})/(1-z)$,    $k=0, 1, 2, \dots$

and the $H(\chi)$ transform of the series $\sum z^n$ is accordingly

(2.1)    $\sigma_n(z)=\frac{1}{1-z}-\frac{z}{1-z}\int_0^1\sum_{k=0}^{n} C_{n,k}(tz)^k(1-t)^{n-k}\,d\chi(t)=\frac{1}{1-z}-\frac{z}{1-z}f_n(z)$

where

(2.2)    $f_n(z)=\int_0^1[1+t(z-1)]^n\,d\chi(t)$.

The sequence $f_n(z)$ is the $H(\chi)$ transform of the sequence $s_k=z^k$.

Let $\chi(t)$ be regular and let $z_0$ be a fixed point inside the circle of summability;
we show that $\sum z_0^n$ is summable to $1/(1-z_0)$ by showing that $f_n(z_0)\to 0$
as $n\to\infty$. Since $z_0$ lies inside the circle of summability,

(2.3)    $|1+r(z_0-1)|<1$

where r is the order of $H(\chi)$ and accordingly $\chi(t)=1$ when $r<t\le 1$. We may
assume that $\chi(t)=1$ over $r\le t\le 1$. Let $\epsilon>0$. Then

$|f_n(z_0)|=\left|\int_0^r[1+t(z_0-1)]^n\,d\chi(t)\right|$

(2.4)    $\le\int_0^{\delta}|d\chi(t)|+\int_{\delta}^{r}|1+t(z_0-1)|^n\,|d\chi(t)|<\epsilon$

when n is sufficiently great, provided $\delta$ is a positive number so chosen that

(2.5)    $\int_0^{\delta}|d\chi(t)|<\epsilon/2$

and A is a constant for which

(2.6)    $\max_{\delta\le t\le r}|1+t(z_0-1)|\le A<1$.


Hence $f_n(z_0)\to 0$ and our result follows. In case F is a closed set interior to
the circle of summability, it is possible to fix $\delta>0$ such that (2.5) holds and
then show existence of a constant A such that (2.6) holds for each $z\in F$;
this implies uniform summability over the set F.

3. Proof of non-summability. Assuming that $\chi(t)$ satisfies the hypotheses
of Theorem 1.1 and that z is a fixed point outside the circle of summability,
we show that $\sum z^n$ is not summable $H(\chi)$. For this, it is sufficient to show that
the $H(\chi)$ transform $f_n(z)$ of the sequence $z^k$ is unbounded.

The hypothesis that z lies outside the circle of summability implies that
$|z|>1$. Hence there is a number $t_1$ such that $0\le t_1<1$ and

$|1+(z-1)t|\le 1$,    $0\le t\le t_1$,
$|1+(z-1)t|>1$,    $t_1<t\le 1$.

In case $\Re z\ge 1$, $t_1$ is 0; and in case $\Re z<1$, we have $0<t_1<1$. Again using the
hypothesis that z lies outside the circle of summability, we obtain

$|1+(z-1)r|>1$

where r is the order of $H(\chi)$, and accordingly $r>t_1$. Hence our result is a
consequence of the following theorem.


THEOREM 3.1. If $|z|>1$ and $\chi(t)$ is a function of bounded variation over
$0\le t\le 1$ having no removable discontinuities, then a necessary and sufficient
condition that the $H(\chi)$ transform

$f_n(z)=\int_0^1[1+(z-1)t]^n\,d\chi(t)$

of the sequence $s_k=z^k$ be bounded is that $\chi(t)$ be constant over the interval $t_1<t\le 1$
over which $|1+(z-1)t|>1$(2).
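A simple case illustrates the necessity part of the theorem. The following worked computation is a sketch added for illustration; it takes the Cesàro method $C_1$, whose mass function is $t$ (the case $r=1$ of $1-(1-t)^r$), and evaluates the transform of the sequence $z^k$ directly.

```latex
f_n(z)=\int_0^1\bigl[1+(z-1)t\bigr]^n\,dt
      =\frac{z^{\,n+1}-1}{(n+1)(z-1)},\qquad
|f_n(z)|\ \ge\ \frac{|z|^{\,n+1}-1}{(n+1)\,|z-1|}\longrightarrow\infty
\quad (|z|>1).
```

Since this mass function is constant over no interval, the unboundedness for $|z|>1$ agrees with the statement of the theorem.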


Sufficiency is obvious; for if $\chi(t)$ is constant over $t_1<t\le 1$, then we may
assume $\chi(t)$ constant over $t_1\le t\le 1$ and obtain

$|f_n(z)|\le\int_0^{t_1}|1+(z-1)t|^n\,|d\chi(t)|$.

To prove necessity, we simplify writing by setting $\zeta=z-1$, and assume
that the sequence $f_n$ defined by

$f_n=\int_0^1(1+\zeta t)^n\,d\chi(t)$


(2) In case $H(\chi)$ is an Euler method $E_r$ for which $0<r<1$, $f_n(z)=[1+(z-1)r]^n$ and we see
immediately that $f_n(z)$ is bounded if and only if $|1+(z-1)r|\le 1$ and hence if and only if $0<r\le t_1$
and therefore if and only if $\chi(t)=1$ over $t_1<t\le 1$. The case in which $H(\chi)$ is not an Euler method
is not so simple.




is bounded. Integrating by parts, setting

$v_1(t)=-\int_t^1 d\chi(s)=\chi(t)-\chi(1)$,

we obtain

(3.11)    $f_n=\int_0^1(1+\zeta t)^n\,dv_1(t)=-v_1(0)-n\zeta\int_0^1(1+\zeta t)^{n-1}v_1(t)\,dt$.

If we show that $v_1(t)=0$ over $t_1<t\le 1$, our result will follow. The function
$v_1(t)$ has bounded variation over $0\le t\le 1$ and has no removable discontinuities.
If we set

(3.12)    $I_n^{(1)}=\int_0^1(1+\zeta t)^n v_1(t)\,dt$

then (3.11) and our hypotheses imply that $I_n^{(1)}\to 0$ as $n\to\infty$. Integrating by
parts, setting

$v_2(t)=-\int_t^1 v_1(u)\,du$,

we obtain

$I_n^{(1)}=\int_0^1(1+\zeta t)^n\,dv_2(t)=-v_2(0)-n\zeta\int_0^1(1+\zeta t)^{n-1}v_2(t)\,dt$.

If we show that $v_2(t)=0$ over $t_1<t\le 1$, our result will follow. The function
$v_2(t)$ is continuous and if we set

$I_n^{(2)}=\int_0^1(1+\zeta t)^n v_2(t)\,dt$

then $I_n^{(2)}\to 0$ as $n\to\infty$. If we integrate by parts once more, setting

$g(t)=-\int_t^1 v_2(u)\,du$,

we see that our result is a consequence of the following lemma.


Lemma 3.2. If $g(t)$ has a continuous derivative, if $|1+\zeta|>1$, and if

(3.21)    $I_n=\int_0^1(1+\zeta t)^n g(t)\,dt$

is bounded, then $g(t)=0$ over the interval $t_1<t\le 1$ of values of t for which
$|1+\zeta t|>1$.


Since $I_n$ is bounded, the series $\sum I_n w^n$ converges and defines a function

$F_1(w)=\sum_{n=0}^{\infty} I_n w^n$

analytic over the open circular region $|w|<1$ of a complex w-plane. If
$|w|<1/|1+\zeta|$, then

$|w(1+\zeta t)|\le|w|\,|1+\zeta|<1$,

so that

$\sum_{n=0}^{\infty}(1+\zeta t)^n w^n g(t)$

converges uniformly over $0\le t\le 1$. Hence it follows from (3.21) that when
$|w|<1/|1+\zeta|$,

$F_1(w)=\sum_{n=0}^{\infty} w^n\int_0^1(1+\zeta t)^n g(t)\,dt=\int_0^1\frac{g(t)}{1-w-w\zeta t}\,dt$.

But the function $F_2(w)$ defined by

$F_2(w)=\int_0^1\frac{g(t)}{1-w-w\zeta t}\,dt$

is analytic at each point w of the complex plane except possibly those for
which the equation

$w=\frac{1}{1+\zeta t}$

holds for some t in the interval $0\le t\le 1$.
Let $t_0$ be fixed such that $0<t_0\le 1$ and $|1+\zeta t_0|>1$. If we set, for y real,

$w_y=\frac{1}{1+\zeta(t_0+iy)}$,

then

$1-w_y-w_y\zeta t=1-\frac{1+\zeta t}{1+\zeta(t_0+iy)}=\frac{\zeta(t_0-t+iy)}{1+\zeta(t_0+iy)}$

and hence when $y\ne 0$

$F_2(w_y)=\frac{1+\zeta(t_0+iy)}{\zeta}\int_0^1\frac{g(t)}{t_0-t+iy}\,dt$.

But $F_1(w)$ and $F_2(w)$ are functions which are analytic and equal at points
$w=w_y$ for which y is real, not 0, and $|y|$ is sufficiently small. Moreover $F_1(w)$
is analytic at the point $w=w_0=1/(1+\zeta t_0)$. Therefore $\lim_{y\to 0}F_2(w_y)$ must exist.
Hence if we set

$G(y)=\int_0^1\frac{g(t)}{t-t_0-iy}\,dt$,

then $\lim_{y\to 0}G(y)$ must exist. We are now in a position to complete the proof by
showing that $g(t_0)$ must be 0.

Since $g(t)$ has a continuous derivative, there is a continuous function $B(t)$
such that

$g(t)=B(t)(t-t_0)+g(t_0)$.

Hence $G(y)=G_1(y)+G_2(y)$ where

$G_1(y)=\int_0^1\frac{B(t)(t-t_0)}{t-t_0-iy}\,dt$,    $G_2(y)=g(t_0)\int_0^1\frac{dt}{t-t_0-iy}$.

Since the first integrand is dominated by $|B(t)|$ and converges to $B(t)$ as
$y\to 0$, $\lim_{y\to 0}G_1(y)$ exists; hence $\lim_{y\to 0}G_2(y)$ must exist. Upon evaluating the
integral for $G_2(y)$, we see that existence of $\lim_{y\to 0}G_2(y)$ implies that $g(t_0)=0$.
This completes the proof of Lemma 3.2 and hence also the proofs of
Theorems 3.1 and 1.1.

4. Uniform summability of power series inside Euler polygons B(r). Let
$\sum c_n z^n$ be a power series with a finite positive radius of convergence R, and
let $f(z)$ be the function generated by analytic extension along radial lines from
the origin. The open set in which $f(z)$ is thus defined is the Mittag-Leffler star S.
If a half-line $l_\phi$ of points z, such that $z=\rho e^{i\phi}$ where $\rho\ge 0$, contains no singular
point of $f(z)$, then $l_\phi$ is in S. If $l_\phi$ contains a singular point, and $\rho_0$ is the least
value of $\rho$ for which $\rho e^{i\phi}$ is a singular point, then the point $\zeta=\rho_0 e^{i\phi}$ is a vertex
of the star; the points of $l_\phi$ for which $0\le\rho<\rho_0$ lie in S and the points for which
$\rho\ge\rho_0$ are exterior to S.

Let r be fixed such that $0<r<1$. Corresponding to each vertex $\zeta$ of the
star, let $B(r, \zeta)$ denote the set of points z for which

(4.01)    $|z-(1-r^{-1})\zeta|<r^{-1}|\zeta|$.

This set $B(r, \zeta)$ is the interior of the circle, with center at $(1-r^{-1})\zeta$, which
passes through the point $\zeta$. The set $B(r, \zeta)$ contains the interior of the circle
of convergence. Let $B(r)$ denote the set of inner points of the intersection of
the sets $B(r, \zeta)$ determined by the set of vertices $\zeta$ of S. This set $B(r)$, which is
not a polygon in the ordinary sense, was called a curvilinear polygon by
Knopp [14]; we shall call it the Euler polygon of order r. For each r, $B(r)$ is a
bounded convex open set containing the inner points of the circle of convergence.
If $r_1<r_2$, then $B(r_2, \zeta)\subset B(r_1, \zeta)$ for each $\zeta$ and accordingly $B(r_2)\subset B(r_1)$.
The union, for $0<r<1$, of the sets $B(r)$ is (Knopp [14] and Rademacher [18])
the Borel polygon B. It was shown by Knopp and Rademacher, for the case
in which $r=2^{-p}$, that $\sum c_n z^n$ is summable $E_r$ or non-summable $E_r$ according
as z lies inside or outside the Euler polygon $B(r)$ of order r. The following theorem
presents a fact, involving summability by regular Hausdorff methods,
which is the analogue of the fundamental fact that a power series having a
finite radius of convergence converges uniformly over each closed set F inside
the circle of convergence.
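The definition of the sets $B(r, \zeta)$ translates directly into a membership test. The sketch below is illustrative only: it assumes a finite list of star vertices (the vertices shown are hypothetical, chosen for the example), in which case the intersection of the open discs is already open, and the function names are not from the paper.

```python
def in_euler_disc(z, r, zeta):
    """True when z lies in B(r, zeta), i.e. |z - (1 - 1/r) zeta| < |zeta| / r."""
    center = (1.0 - 1.0 / r) * zeta
    return abs(z - center) < abs(zeta) / r


def in_euler_polygon(z, r, vertices):
    """Membership in the intersection of the discs B(r, zeta) over the given vertices."""
    return all(in_euler_disc(z, r, zeta) for zeta in vertices)


vertices = [2 + 1j, -3j]  # hypothetical star vertices, for illustration only
print(in_euler_polygon(0.5 + 0.5j, 0.5, vertices))
```

Each disc passes through its vertex and is internally tangent there to the larger discs obtained for smaller r, which is the geometric content of the inclusion of the disc of order $r_2$ in the disc of order $r_1$ when $r_1<r_2$.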


THEOREM 4.1. If $H(\chi)$ is a regular Hausdorff transformation of order r and F
is a closed subset of the Euler polygon $B(r)$ of order r of a power series $\sum c_n z^n$
having a finite positive radius of convergence, then $\sum c_n z^n$ is summable $H(\chi)$
uniformly over F to the function $f(z)$ obtained by analytic extension of $\sum c_n z^n$ along
radial lines from the origin.


Let $z_1 \in B(r)$. Then, when $\zeta$ is a vertex of the star $S$,
$|z_1-(1-r^{-1})\zeta| < r^{-1}|\zeta|$. Let $\zeta'$ be a point not in $S$. Then a vertex
$\zeta$ of $S$ and a number $\rho \ge 1$ exist such that $\zeta'=\rho\zeta$. The circular
set of points $z$ for which $|z-(1-r^{-1})\zeta'| < r^{-1}|\zeta'|$ contains the circular
set of points $z$ for which $|z-(1-r^{-1})\zeta| < r^{-1}|\zeta|$ and hence contains $z_1$.
This shows that if $z_1 \in B(r)$ and $\zeta'$ is not in the star, then
$|z_1-(1-r^{-1})\zeta'| < r^{-1}|\zeta'|$. It follows that if $z_1 \in B(r)$, then the set
of all points $u$ for which

(4.11)  $|z_1-(1-r^{-1})u| \ge r^{-1}|u|$

must lie in the star. The set of points $u$ for which (4.11) holds is the set for
which

(4.12)  $\left| u - \frac{1-r}{2-r}\, z_1 \right| \le \frac{|z_1|}{2-r}.$

Therefore, if $z_1 \in B(r)$, the circular set of points $u$ satisfying (4.11) and (4.12)
must lie in the star $S$. This circular set contains the origin in its interior, and
the point $z_1$ lies on the boundary.
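The passage from the first description of this circular set to the second is a matter of squaring and completing the square. As a hedged sanity check, the sketch below compares the two conditions, taken in the forms $|z_1-(1-r^{-1})u| \ge r^{-1}|u|$ and $|u - \frac{1-r}{2-r} z_1| \le \frac{|z_1|}{2-r}$, on a grid of points $u$. The function names, the grid and the sample values of $z_1$ and $r$ are assumptions made for illustration only.

```python
def satisfies_first_form(z1, u, r):
    """|z1 - (1 - 1/r) u| >= |u| / r."""
    return abs(z1 - (1 - 1 / r) * u) >= abs(u) / r

def satisfies_disc_form(z1, u, r):
    """|u - (1-r)/(2-r) z1| <= |z1| / (2-r)."""
    return abs(u - (1 - r) / (2 - r) * z1) <= abs(z1) / (2 - r)

def conditions_agree(z1, r, half_width=3.0, steps=41, tol=1e-9):
    """Compare both conditions on a square grid, skipping boundary points."""
    center = (1 - r) / (2 - r) * z1
    radius = abs(z1) / (2 - r)
    for i in range(steps):
        for j in range(steps):
            x = -half_width + 2 * half_width * i / (steps - 1)
            y = -half_width + 2 * half_width * j / (steps - 1)
            u = complex(x, y)
            if abs(abs(u - center) - radius) < tol:
                continue  # too close to the circle for a reliable comparison
            if satisfies_first_form(z1, u, r) != satisfies_disc_form(z1, u, r):
                return False
    return True

print(conditions_agree(1 + 0.5j, 0.4), conditions_agree(-0.3 + 2j, 0.9))
```

The same disc form also makes the two closing remarks visible: the centre is at distance $\frac{1-r}{2-r}|z_1|$ from the origin, which is less than the radius, and $z_1$ is at distance exactly $\frac{|z_1|}{2-r}$ from the centre.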

Let $\rho(\phi)$ be the continuous positive function such that the point
$w=\rho(\phi)e^{i\phi}$ traverses the boundary of $B(r)$ as $\phi$ increases from 0 to
$2\pi$. Corresponding to each multiplier $h$ for which $0<h<1$, let $B(r, h)$ be the
open set of points inside the curve $w=h\rho(\phi)e^{i\phi}$. The sets $B(r, h)$ are
nested subsets of $B(r)$ in the sense that if $0<h_1<h_2<1$, then
$B(r, h_1) \subset B(r, h_2) \subset B(r)$. Since $F$ is a closed subset of the open set
$B(r)$, there is a multiplier $h_0$ such that $0<h_0<1$ and $F \subset B(r, h_0)$. Let $h_1$
be fixed such that $h_0<h_1<1$. Since $H(x)$ is regular, the $H(x)$ transform of
$\sum c_n z^n$ converges to $f(0)$ when $z=0$. Accordingly, it is sufficient to prove
that the $H(x)$ transform of $\sum c_n z^n$ converges to $f(z)$ uniformly over the set
$B_0$ obtained by deleting the point $z=0$ from the set $B(r, h_0)$.

Let $z \in B_0$. Then there is a number $\lambda=\lambda(z)>1$ such that the point
$\lambda z$ lies on the boundary of the set $B(r, h_1)$; we shall want to use the fact
that $\lambda>\lambda_0$ where $\lambda_0=h_1/h_0$. Let $C(z)$ denote the circle composed
of the points $u$ for which

(4.21)  $|\lambda z-(1-r^{-1})u| = r^{-1}|u|.$

Since $\lambda z \in B(r)$, the circle $C(z)$ lies in the star. The equation of $C(z)$ can
be written in the form

(4.22)  $\left| u - \frac{1-r}{2-r}\, \lambda z \right| = \frac{|\lambda z|}{2-r}.$

From this equation, it is apparent that the origin and the point $z$ lie inside the
circle $C(z)$; in fact 0 and $z$ are points of the diameter having its ends at the
points $-[r/(2-r)]\lambda z$ and $\lambda z$. Since $0<r/(2-r)<1<\lambda$, it is apparent that
if $u \in C(z)$, then


(4.23)  $|u| \ge \frac{r|\lambda z|}{2-r} \ge \frac{r c_1}{2-r} = c_2 > 0$

where $c_1$ is the minimum value of $|\lambda z|$ when $\lambda z$ lies on the boundary of
the set $B(r, h_1)$ and accordingly $c_1=h_1 R$ where $R$ is the radius of convergence
of $\sum c_n z^n$; and where $c_2$ is defined by the last equality. Since on the one
hand $|\lambda z-z| \ge c_3>0$ where $c_3$ is the minimum value of $|z_1-z_2|$ when $z_1$
lies on the boundary of the set $B(r, h_1)$ and $z_2$ lies in the closure of the set
$B(r, h_0)$, and on the other hand

$|-[r/(2-r)]\lambda z - z| \ge [r/(2-r)]\,|\lambda z| \ge c_2,$

it follows that when $z \in B_0$ and $u \in C(z)$

(4.24)  $|u-z| \ge c_4 > 0,$

the constant $c_4$ being the minimum value of $c_3$ and $c_2$.
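When $z \in B_0$ and $u \in C(z)$, the equation (4.21) gives

(4.31)  $\left| \frac{z}{u} - \frac{1}{\lambda}\left(1-\frac{1}{r}\right) \right| = \frac{1}{\lambda r}.$

If $\mu_1 \le \mu_2$, then the set of points $v$ of the complex plane for which the
inequality

(4.32)  $\left| v - \mu\left(1-\frac{1}{r}\right) \right| \le \frac{\mu}{r}$

holds when $\mu=\mu_1$ is a subset of the set of points $v$ for which the inequality
holds when $\mu=\mu_2$. Hence we can use (4.31) and the fact that
$\lambda=\lambda(z) \ge \lambda_0$ to obtain

$\left| \frac{z}{u} - \frac{1}{\lambda_0}\left(1-\frac{1}{r}\right) \right| \le \frac{1}{\lambda_0 r}.$

When $\mu$ has the fixed value $\mu=\lambda_0^{-1}<1$, the set of points $v$ for which
(4.32) holds is a closed subset of the set of points $v$ for which

$\left| v - \left(1-\frac{1}{r}\right) \right| < \frac{1}{r}.$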




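Hence there is a constant $\theta$, depending only on $r$ and $\lambda_0$ and independent
of $z$ and $u$, such that $0<\theta<1$ and

(4.33)  $\left| \frac{z}{u} - \left(1-\frac{1}{r}\right) \right| \le \frac{\theta}{r}.$

If $\delta$ is a fixed number for which $0<\delta<r$, then (4.33) implies existence of
$\alpha$ and $\phi$ such that $0 \le \alpha \le 1$, $0 \le \phi \le 2\pi$ and

$\frac{z}{u} = \left(1-\frac{1}{r}\right) + \frac{\alpha\theta}{r}\, e^{i\phi},$

and it follows easily that, when $\delta \le t \le r$,

(4.34)  $\left| 1 + t\left(\frac{z}{u}-1\right) \right| = \left| 1 - \frac{t}{r} + \frac{t\alpha\theta}{r}\, e^{i\phi} \right| \le 1 - \frac{(1-\theta)\delta}{r} = \theta_1,$

where $\theta_1$ is a constant, defined by the equality, depending on $\delta$, $\theta$, and
$r$ but independent of $z$ and $u$. These considerations show also that

(4.35)  $\left| 1 + t\left(\frac{z}{u}-1\right) \right| \le 1, \qquad 0 \le t \le r.$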

For each $h$ in the interval $0<h<1$, let $W(h)$ denote the set which contains
those and only those points $w$ for which the inequality

(4.41)  $|z'-(1-r^{-1})w| \ge r^{-1}|w|$

holds for at least one point $z'$ in the closure of the set $B(r, h)$ defined above.
For each $h$, $W(h)$ is [see (4.11)] the union of sets in the star and accordingly
$W(h)$ is in the star. It is probably true that $W(h)$ is closed, but we do not
need the result. For each $h$, the set $W(h)$ is bounded. If $0<h_1<h_2<1$, then
the points of the closure of $W(h_1)$ are inner points of $W(h_2)$ and it follows
that the closure $\overline{W}(h_1)$ of $W(h_1)$ lies in the star. Since $\overline{W}(h_1)$
is a bounded closed subset of the star, $f(z)$ must be bounded over $W(h_1)$. Choose a
constant $c_5$ such that

$|f(z)| \le c_5, \qquad z \in W(h_1).$

From the definitions of $C(z)$ and $W(h_1)$, it follows that the points on the
curve $C(z)$ lie in the set $W(h_1)$ for each $z \in B_0$. Hence, when $z \in B_0$,

(4.42)  $|f(u)| \le c_5, \qquad u \in C(z),$

the constant $c_5$ being independent of $z$ and $u$.




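For each $z \in B_0$, let $C(z)$ be the circle defined above. The coefficients $c_j$
are related to $f(z)$ by the familiar formula

(4.51)  $c_j = \frac{1}{2\pi i} \int_{C(z)} \frac{f(u)}{u^{j+1}}\, du, \qquad j=0, 1, 2, \cdots;$

and the fact that $C(z)$ surrounds both the origin and the point $z$ implies that
the partial sums of $\sum c_j z^j$ are given by

(4.52)  $s_k(z) = \frac{1}{2\pi i} \int_{C(z)} f(u)\, \frac{u^{k+1}-z^{k+1}}{u^{k+1}(u-z)}\, du = f(z) - \frac{1}{2\pi i} \int_{C(z)} \frac{f(u)}{u-z} \left(\frac{z}{u}\right)^{k+1} du.$

The $H(x)$ transform of $\sum c_j z^j$ is accordingly given by

(4.53)  $\sigma_n(z) = f(z) - R_n(z)$

where

$R_n(z) = \frac{1}{2\pi i} \int_{C(z)} \frac{z f(u)}{u(u-z)} \left\{ \int_0^1 \left[ 1 + t\left(\frac{z}{u}-1\right) \right]^n dx(t) \right\} du.$

Hence we may complete the proof of Theorem 4.1 by showing that $R_n(z)$
converges to 0 uniformly over $B_0$. Since $x(t)$ is constant over $r<t \le 1$, we may
assume that $x(t)=1$ over $r \le t \le 1$ and replace the upper limit of integration
by $r$. We then obtain

$|R_n(z)| \le \frac{1}{2\pi} \int_{C(z)} \frac{|z|\,|f(u)|}{|u|\,|u-z|} \left\{ \int_0^r \left| 1 + t\left(\frac{z}{u}-1\right) \right|^n |dx(t)| \right\} |du|.$

If $c_6$ denotes the least upper bound of $|z|$ for $z \in B_0$, then

$\frac{|z|\,|f(u)|}{2\pi |u|\,|u-z|} \le \frac{c_6 c_5}{2\pi c_2 c_4} = c_7$

where $c_7$ is the constant, independent of $z$ and $u$, defined by the last equality.
Let $c_8$ be the least upper bound of the circumferences of the circles $C(z)$;
it is obviously finite since the circles lie in the bounded set $W(h_1)$. Let
$c_9=c_7 c_8$. With the aid of (4.34) and (4.35) we obtain, when $0<\delta<r$,

(4.6)  $|R_n(z)| \le c_9 \int_0^{\delta} |dx(t)| + c_9\, \theta_1^{\,n} \int_{\delta}^{r} |dx(t)|.$

Let $\epsilon>0$, and fix $\delta$ such that $0<\delta<r$ and the first term on the right is
less than $\epsilon/2$. Then choose $N$ such that the last term is less than $\epsilon/2$
when $n \ge N$. We then have $|R_n(z)|<\epsilon$ when $z \in B_0$ and $n \ge N$. This completes
the proof of Theorem 4.1.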

Throughout this section, we have considered power series $\sum c_n z^n$ for
which the radius of convergence $R$ is finite. In case $R=\infty$ the series is
uniformly summable, to the entire function $f(z)$ determined by the series, over
each bounded set $E$ by each regular transformation of the form

(4.7)  $\sigma(t) = \sum_{k=0}^{\infty} a_k(t)\, s_k$

and hence in particular by each regular transformation of the form $H(x)$.
To prove this, we note first that if $s_n(z)$, $n=0, 1, 2, \cdots$, is the sequence of
partial sums of $\sum c_n z^n$, then the sequence converges uniformly over $E$ to
$f(z)$ and the sequence is uniformly bounded over $E$. Hence (see Agnew [1,
Theorem 7.21], and Agnew [2]) the sequence $s_n(z)$ and the series $\sum c_n z^n$ must
be uniformly summable to $f(z)$ over $E$.

5. Other methods of summability. Let $\sum c_n z^n$ be a power series having a
finite positive radius of convergence, and let $F$ be a closed set interior to
the Borel polygon $B$. Then $r$ exists such that $0<r \le 1$ and $F \subset B(r)$, and, by
Theorem 4.1, $\sum c_n z^n$ is uniformly summable $E_r$ over $F$ to $f(z)$. Let
$\sigma_n^{(r)}(z)$, $n=0, 1, \cdots$, denote the $E_r$ transform of $\sum c_n z^n$. Since $f(z)$
and the functions $\sigma_0^{(r)}(z), \sigma_1^{(r)}(z), \cdots$ are each bounded over $F$, it
follows that the sequence $\sigma_n^{(r)}(z)$ is uniformly bounded over $F$.

Let $G$ be a method of summability of the form (4.7) which includes $E_r$.
For example, $G$ may be the Borel exponential method (Hurwitz [12, p. 27])
or the LeRoy method (Morse [15, p. 281]). Let $G$ be such that, for each $t$,

$\sum a_k(t) z^k$

converges for all complex values of $z$. This requirement, which is automatically
satisfied when $G$ is of finite reference, is also satisfied when $G$ is either
the Borel exponential method or the LeRoy method. Since the inverse of $E_r$
is $E_{1/r}$ (Hurwitz [12]), the $G$ transform of $s_n(z)$ may be written

(5.1)  $\sigma(t, z) = \sum_{k=0}^{\infty} a_k(t) \sum_{p=0}^{k} C_{k,p}\, r^{-p} (1-r^{-1})^{k-p}\, \sigma_p^{(r)}(z).$

Our hypotheses imply that, for each $t$, the series in the right member of (5.1)
is absolutely convergent. Hence (5.1) can be written in the form

(5.2)  $G(t) = \sum_{p=0}^{\infty} \Big[ \sum_{k=p}^{\infty} a_k(t)\, C_{k,p}\, r^{-p} (1-r^{-1})^{k-p} \Big] s_p$

where $G(t)=\sigma(t, z)$ and $s_p=\sigma_p^{(r)}(z)$. The hypothesis that $G$ includes $E_r$
implies that (5.2) is a regular transformation of the form (4.7). Hence it
follows, as at the end of the last section, that $\sigma(t, z)$ must converge to $f(z)$
uniformly over $F$, that is, $\sum c_n z^n$ is summable $G$ to $f(z)$ uniformly over $F$. In
particular, a power series $\sum c_n z^n$ is summable by the Borel exponential method
uniformly over each bounded closed set inside the Borel polygon. This particular
result may be known, but the author is unable to give a reference. Borel [4]
and Phragmén [16] have shown that $\sum c_n z^n$ is summable, by the exponential
method, at each point inside the Borel polygon and non-summable at each
point outside the polygon; and Doetsch [5] has discussed summability on
the boundary of the polygon.

6. Collective Hausdorff summability $\mathcal{H}$. It was shown by Hurwitz and
Silverman [13] that a transformation of the form

(6.11)  $\sigma_n = \sum_{k=0}^{n} a_{nk}\, s_k$

commutes with the $C_1$ transformation

(6.12)  $\sigma_n = \frac{s_0+s_1+\cdots+s_n}{n+1}$

if and only if it has the form

(6.13)  $\sigma_n = \sum_{k=0}^{n} \Big[ \sum_{j=k}^{n} (-1)^{j-k} C_{n,j}\, C_{j,k}\, \lambda_j \Big] s_k$

where $\lambda_0, \lambda_1, \cdots$ is a sequence of complex constants. It was shown by
Hausdorff [9] that such a transformation is regular if and only if the sequence
$\lambda_0, \lambda_1, \cdots$ is the moment sequence

(6.14)  $\lambda_n = \int_0^1 t^n\, dx(t), \qquad n=0, 1, 2, \cdots,$

of a function $x(t)$, having bounded variation over $0 \le t \le 1$, such that
$x(0+)=x(0)=0$ and $x(1)=1$. When (6.14) holds, the transformation (6.13)
takes the form $H(x)$. As was shown both by Hurwitz and Silverman and by
Hausdorff, these regular transformations commute with each other as well
as with $C_1$ and hence constitute a system of consistent methods of summability.
By this we mean that if a series or sequence is summable by two different
regular methods of the form $H(x)$, then the two values assigned must be
equal.

This circumstance makes possible the following definition of a method $\mathcal{H}$
of summability which makes use of the collection of regular methods $H(x)$
and which may be called the collective Hausdorff method. Let a series $\sum u_n$
be called summable $\mathcal{H}$ to the value $\sigma$ if it is summable to the value $\sigma$ by at
least one regular method $H(x)$.




The method $\mathcal{H}$ is obviously regular. It is also linear; by this we mean
that if two series $\sum u_n$ and $\sum v_n$ with partial sums $s_n$ and $t_n$ are summable
$\mathcal{H}$ to $U$ and $V$, respectively, and if $\alpha$ and $\beta$ are constants, then
$\sum (\alpha u_n+\beta v_n)$ is summable $\mathcal{H}$ to $\alpha U+\beta V$. This follows from
Hurwitz and Silverman [13, Theorem 3].

It is clear from Theorem 1.1 that the method $\mathcal{H}$ is stronger than any one
method of the form $H(x)$; if $H(x)$ is regular, then $\sum z^k$ is summable $H(x)$
only inside and perhaps at some of the points on the boundary of the appropriate
circle of summability, but $\sum z^k$ is summable $\mathcal{H}$ for each $z$ in the
half-plane $\Re z<1$. It would be interesting to know whether there is a regular
method of summability, based on a single sequence-to-sequence or
sequence-to-function transformation of the familiar type, which includes
$\mathcal{H}$; perhaps there is one which is equivalent to $\mathcal{H}$. The Borel exponential
method $B_1$ evaluates $\sum z^n$ when $\Re z<1$; but $B_1$ does not include the method
$C_1$ determined by $x(t)=t$ and hence $B_1$ does not include $\mathcal{H}$. The LeRoy
method $L$ includes the regular Euler methods $E_r$ (Morse [15]) and the regular
Cesàro methods $C_r$ (Garabedian [6]); but, if one may judge from the difficulty in
showing that $L$ includes $C_r$, it must be difficult to determine the extent to
which $L$ includes Hausdorff methods. That $L$ does not include $\mathcal{H}$ is a
corollary of Theorem 8.4.

It follows from Theorem 1.1 that the series $\sum z^k$ is non-summable $\mathcal{H}$ for
each $z$ for which $\Re z>1$. This result creates a strong presumption that $\mathcal{H}$ is
ineffective outside the Borel polygon of a power series $\sum a_n z^n$. If a point $z_0$
lies outside the Borel polygon of a series $\sum a_n z^n$, then $z_0$ lies outside the
circle of convergence and accordingly the series $\sum a_n z_0^n$ has unbounded partial
sums.

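We now show existence of series with bounded partial sums $s_n$ which
are not summable $\mathcal{H}$. The series are gap series with large gaps. Let
$n_0, n_1, \cdots$ be a sequence of integers for which $0<n_0<n_1<\cdots$ and

(6.2)  $n_{p+1}/n_p \to \infty$

as $p \to \infty$. Let $b_1, b_2, \cdots$ be a bounded divergent sequence of complex
numbers, and let $\sum u_n$ be the series whose partial sums $s_0, s_1, \cdots$ are defined
by the formulas

(6.21)  $s_k = 0, \quad 0 \le k<n_0; \qquad s_k = b_p, \quad n_{p-1} \le k<n_p, \quad p=1, 2, \cdots.$

Let $H(x)$ be regular; we show that $\sum u_n$ is non-summable $\mathcal{H}$ by showing that
it is non-summable $H(x)$. Letting $\sigma_n$ denote the $H(x)$ transform of $\sum u_n$,
and setting $m_p=n_p-1$, we find that when $p>1$

$\sigma(m_p) = \int_0^1 \sum_{k=0}^{m_p} C_{m_p,k}\, t^k (1-t)^{m_p-k} s_k\, dx(t)$

and

(6.3)  $\sigma(m_p)-s(m_p) = \int_0^1 \sum_{k=0}^{m_p} C_{m_p,k}\, t^k (1-t)^{m_p-k} [s_k-s(m_p)]\, dx(t) = \int_0^1 \sum_{k=0}^{m_{p-1}} C_{m_p,k}\, t^k (1-t)^{m_p-k} [s_k-s(m_p)]\, dx(t)$

so that, where $M$ is a constant for which $|b_k| \le M$ for each $k$,

$|\sigma(m_p)-s(m_p)| \le 2M \int_0^1 \sum_{k=0}^{m_{p-1}} C_{m_p,k}\, t^k (1-t)^{m_p-k}\, |dx(t)|.$

Let $\epsilon>0$. Choose $\delta>0$ such that

$2M \int_0^{\delta} |dx(t)| < \epsilon/2.$

Let $0<\theta<\delta$ and choose an index $P$ such that $m_{p-1}<\theta m_p$ when $p \ge P$. Then
when $p \ge P$

$|\sigma(m_p)-s(m_p)| < \epsilon/2 + 2M \int_{\delta}^{1} \sum_{k<\theta m_p} C_{m_p,k}\, t^k (1-t)^{m_p-k}\, |dx(t)|.$

If $L_1, L_2, \cdots$ is a sequence of positive constants for which $L_n \to \infty$ and
$L_n/n \to 0$, say $L_n=n^{1/2}$, we can use the elementary inequality (see, for
example, Hausdorff [9, p. 104])

$\sum_{k<k(n,t)} C_{n,k}\, t^k (1-t)^{n-k} \le \frac{1}{4L_n},$

where $k(n, t)=n\{t-(L_n/n)^{1/2}\}$, to obtain, when $p$ is sufficiently great,

$|\sigma(m_p)-s(m_p)| < \frac{\epsilon}{2} + \frac{2M}{4L(m_p)} \int_{\delta}^{1} |dx(t)|.$

Since the last term is less than $\epsilon/2$ when $p$ is sufficiently great, and since
$s(m_p)=b_p$, this implies that

(6.4)  $\lim_{p \to \infty} |\sigma(m_p)-b_p| = 0.$

Thus divergence of the sequence $b_p$ implies that of the sequence $\sigma_n$ and
accordingly $\sum u_n$ is non-summable $H(x)$.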
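The mechanism can be seen concretely for the arithmetic mean $C_1$, the regular Hausdorff method determined by $x(t)=t$. The sketch below is illustrative only and rests on assumptions that are not the paper's: gap positions $n_q=(q+2)!$, values $b_p$ alternating between 1 and 0, and partial sums equal to 0 below $n_0$ and to $b_p$ on the block $n_{p-1} \le k<n_p$. Because each block is much longer than everything before it, the mean taken at the end of a block is dragged toward the value on that block.

```python
from math import factorial

def n(q):
    """Assumed gap positions n_q = (q+2)!, so that n_{q+1}/n_q = q+3 grows."""
    return factorial(q + 2)

def b(p):
    """Assumed bounded divergent sequence: 1, 0, 1, 0, ..."""
    return p % 2

def cesaro_mean_at_block_end(p):
    """C_1 mean over k = 0..n_p - 1 of s_k, where s_k = 0 for k < n_0
    and s_k = b(q) for n_{q-1} <= k < n_q."""
    total = sum(b(q) * (n(q) - n(q - 1)) for q in range(1, p + 1))
    return total / n(p)

gaps = [abs(cesaro_mean_at_block_end(p) - b(p)) for p in range(1, 11)]
print(gaps)
```

With these choices the gap at index $p$ is at most $n_{p-1}/n_p = 1/(p+2)$, so the means at the block ends follow the divergent sequence $b_p$ and cannot converge.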

7. A Tauberian theorem for Hausdorff methods. It is easy to amplify the
work of the preceding paragraph to establish the following Tauberian gap
theorem in which there is no order condition placed upon the non-vanishing
terms or partial sums of the series.

THEOREM 7.1. If $n_0, n_1, n_2, \cdots$ is a sequence of integers for which
$0<n_0<n_1<n_2<\cdots$ and $n_{p+1}/n_p \to \infty$ as $p \to \infty$, if $\sum u_n$ is a series
for which

$u_n = 0, \qquad n \ne n_0, n_1, n_2, \cdots,$

and if $\sum u_n$ is summable $\mathcal{H}$, then $\sum u_n$ is convergent(*).

The hypothesis implies existence of a sequence $b_1, b_2, \cdots$ such that the
partial sums $s_0, s_1, \cdots$ of $\sum u_n$ satisfy (6.21). Let $H(x)$ be a regular
Hausdorff method by which $\sum u_n$ is summable. In case the sequence $b_p$ is bounded,
we can proceed exactly as above to obtain (6.4). Convergence of $\sigma_n$ then
implies that of $b_p$ and hence that of $\sum u_n$. It remains for us to show that the
sequence $b_p$ must be bounded. For each $p=1, 2, \cdots$ let $M_p$ denote the maximum
value of $|s_k|$ when $0 \le k \le m_p$ where, as above, $m_p=n_p-1$. Then we
can obtain (6.3) and conclude that

$|\sigma(m_p)-s(m_p)| \le 2M_p \int_0^1 \sum_{k=0}^{m_{p-1}} C_{m_p,k}\, t^k (1-t)^{m_p-k}\, |dx(t)|.$

Choose $\delta>0$ such that

$\int_0^{\delta} |dx(t)| < 1/5$

and then choose an index $P$ such that

$\int_{\delta}^{1} \sum_{k=0}^{m_{p-1}} C_{m_p,k}\, t^k (1-t)^{m_p-k}\, |dx(t)| < 1/5, \qquad p \ge P.$

Then

$|\sigma(m_p)-s(m_p)| \le (4/5) M_p, \qquad p \ge P.$

If the sequence $b_p$ is unbounded, then $M_p \to \infty$ as $p \to \infty$ and there is an
infinite set of indices $p \ge P$ for which $|s(m_p)|=M_p$. For such values of $p$,
$|\sigma(m_p)| \ge M_p/5$. This is inconsistent with the hypothesis that $\sum u_n$ is
summable $H(x)$; hence the sequence $b_p$ must be bounded and Theorem 7.1 is
proved.

8. Criteria involving zeros of moment functions. Let $H(x)$ be regular, and
let

$\mu(z) = \int_0^1 t^z\, dx(t), \qquad \Re z \ge 0,$

be the moment function determined by $x(t)$. The function $\mu(z)$ is continuous
in the closed half-plane $\Re z \ge 0$ and is analytic in the open half-plane $\Re z>0$.


(*) For a Tauberian theorem involving a subclass of the regular Hausdorff methods, see 
Pitt [17, pp. 280-284] and Agnew [3]. 




Several investigations have shown that the zeros of $\mu(z)$ play a fundamental
role in the theory of transformations of the form $H(x)$. This applies, in
particular, to the relation between $H(x)$ and generalized Abel summability $A^*$
used by Silverman and Tamarkin [20]. Let a series $\sum u_n$ with partial sums $s_n$
be called summable $A^*$ to $\sigma^*$ if the series

$\sum_{k=0}^{\infty} (1-z) z^k s_k$

has a positive radius of convergence and defines, by analytic extension along
radial lines from the origin, a function $\sigma^*(z)$ such that $\sigma^*(z)$ exists when
$0<z<1$ and $\sigma^*(z) \to \sigma^*$ as $z \to 1$ over the set $0<z<1$. It was shown by
Silverman and Tamarkin [20] that if $\mu(z)$ has a zero with real part positive, then
$A^*$ does not include $H(x)$; and that if $\mu(z)$ has a zero with real part 0 and a
certain supplementary condition on $x(t)$ is satisfied, then again $A^*$ does not
include $H(x)$. In this section we obtain some results involving zeros of $\mu(z)$
on the critical line $\Re z=0$; in particular we remove the supplementary condition
of Silverman and Tamarkin by proving the following theorem.

THEOREM 8.1. If $x(t)$ is regular and if $\mu(z)$ has a zero $q$ for which $\Re q=0$,
then $A^*$ does not include $H(x)$.
Our results are obtained from consideration of the sequence

$s_k^{(q)} = k^q = e^{q \log k}$

in which $q$ is a complex number with $\Re q \ge 0$ and $q \ne 0$. Let $H(x)$ be regular,
and let $\sigma_n^{(q)}$ denote the $H(x)$ transform of the sequence $s_k^{(q)}$. Then

(8.11)  $n^{-q} \sigma_n^{(q)} = \int_0^1 f_n(t)\, dx(t)$

where

(8.12)  $f_n(t) = \sum_{k=0}^{n} C_{n,k}\, t^k (1-t)^{n-k} f(k/n)$

and

(8.13)  $f(t) = t^q.$

The function $f(t)$ is bounded, is continuous over $\delta \le t \le 1$ for each $\delta>0$, and
is continuous at $t=0$ if and only if $\Re q>0$. For each $n$, the function $f_n(t)$ is
(see Hausdorff [9, p. 104]) the $n$th Bernstein polynomial determined by $f(t)$,
and $f_n(t)$ converges uniformly to $f(t)$ over each interval $\delta \le t \le 1$. Let
$\epsilon>0$. Choose $\delta>0$ such that

(8.14)  $\int_0^{\delta} |dx(t)| < \epsilon/2.$




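Then, since $|f_n(t)| \le 1$ and $|f(t)| \le 1$ when $0 \le t \le 1$ and $n=0, 1, 2, \cdots$,

(8.15)  $\left| \int_0^1 [f_n(t)-f(t)]\, dx(t) \right| \le \int_0^{\delta} |f_n(t)-f(t)|\, |dx(t)| + \int_{\delta}^{1} |f_n(t)-f(t)|\, |dx(t)| \le \epsilon + \Big[ \max_{\delta \le t \le 1} |f_n(t)-f(t)| \Big] \int_{\delta}^{1} |dx(t)|.$

This implies that, as $n \to \infty$, the superior limit of the first member of (8.15)
is less than or equal to $\epsilon$ and hence 0. Hence we can let $n \to \infty$ in (8.11) to
obtain

(8.16)  $\lim_{n \to \infty} n^{-q} \sigma_n^{(q)} = \int_0^1 t^q\, dx(t) = \mu(q).$

Suppose now that $\Re q=0$ but $q \ne 0$, say $q=iy$ where $y$ is real and $y \ne 0$. Then
the sequence

$n^q = e^{iy \log n}$

is a sequence of points, on the unit circle, having the unit circle for its set
of limit points. If $\mu(q)=0$, then (8.16) implies that $\sigma_n^{(q)} \to 0$; but if
$\mu(q) \ne 0$, then $\sigma_n^{(q)}$ is a divergent sequence whose limit points constitute a
circle with radius $|\mu(q)|$. Thus we obtain the following theorem which has several
applications.

THEOREM 8.2. If $H(x)$ is regular and $\Re q=0$ but $q \ne 0$, then the sequence
$s_k^{(q)}=k^q$ is summable $H(x)$ if and only if $\mu(q)=0$.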
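For the arithmetic mean $C_1$, determined by $x(t)=t$, the moment function is $\mu(z)=\int_0^1 t^z\,dt=1/(z+1)$, which has no zeros, so the theorem predicts that $k^{iy}$ is never summable $C_1$ and that its means trace out a circle of radius $|\mu(iy)|$. The sketch below checks this numerically by comparing the mean with $\mu(q)\,n^q$. The convention $s_0=0$ and the sample values of $y$ and $n$ are assumptions made for illustration.

```python
import cmath
import math

def cesaro_mean_of_powers(y, n):
    """C_1 mean of s_0 = 0 and s_k = k^(iy) = exp(i*y*log k), k = 1..n."""
    total = sum(cmath.exp(1j * y * math.log(k)) for k in range(1, n + 1))
    return total / (n + 1)

y, n = 1.0, 10000
q = 1j * y
mu_q = 1 / (q + 1)                   # moment function of x(t) = t at z = q
n_to_q = cmath.exp(q * math.log(n))  # n^q, a point of the unit circle
gap = abs(cesaro_mean_of_powers(y, n) - mu_q * n_to_q)
print(abs(mu_q), gap)
```

The gap shrinks roughly like $1/n$, so the means stay close to the rotating point $\mu(q)\,n^q$ of modulus $1/\sqrt{2}$ and do not settle down.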


We now use Theorem 8.2 to prove Theorem 8.1. Let $H(x)$ be regular and
let $q$ be a zero of $\mu(z)$ for which $\Re q=0$. Since regularity of $H(x)$ implies
that $\mu(0)=1$, we have $q \ne 0$. Hence, by Theorem 8.2, the sequence $k^q$ is
summable $H(x)$. The sequence $k^q$ is the sequence of partial sums of a divergent
series $\sum u_n$ for which $n|u_n|$ is bounded; hence an elementary Tauberian
theorem implies that the sequence $k^q$ is not summable by the ordinary Abel
method $A$. Since the series $\sum k^q z^k$ has radius of convergence 1, it follows that
the sequence $k^q$ is also not summable by the generalized Abel method $A^*$.
Existence of the sequence summable $H(x)$ but not $A^*$ shows that $A^*$ does
not include $H(x)$ and establishes Theorem 8.1.

Another consequence of Theorem 8.2 is set forth in the following theorem
in which it is not assumed that the transformation $H(x_1)$ has an inverse.

THEOREM 8.3. If $H(x_2)$ and $H(x_1)$ are two regular Hausdorff methods such
that $H(x_2)$ includes $H(x_1)$, then each zero of the moment function $\mu_1(z)$ of
$x_1(t)$ with real part 0 is also a zero of the moment function $\mu_2(z)$ of $x_2(t)$.




To prove this theorem, let $q$ be a zero of $\mu_1(z)$ for which $\Re q=0$. Then
$q \ne 0$, and Theorem 8.2 implies that the sequence $k^q$ is summable $H(x_1)$.
Hence $k^q$ is also summable $H(x_2)$ and Theorem 8.2 implies that $\mu_2(q)=0$.

In case $H(x_1)$ and $H(x_2)$ satisfy the hypothesis of Theorem 8.3 and $H(x_1)$
has an inverse $[H(x_1)]^{-1}$, then the transformation $H(x_2)[H(x_1)]^{-1}$ is a
regular Hausdorff transformation $H(x_3)$ such that $H(x_2)=H(x_3)H(x_1)$, and the
moment function $\mu_3(z)$ of $x_3(t)$ is such that

(8.31)  $\mu_2(z) = \mu_3(z)\, \mu_1(z), \qquad \Re z \ge 0.$

This result was established by Hurwitz and Silverman [13] for the case in
which $\mu_1(z)$ and $\mu_2(z)$ are analytic at $\infty$ and in a half-plane $\Re z>-a$ for some
$a>0$; the extension to the more general regular transformations was made
by Hille and Tamarkin [11] and by Garabedian, Hille, and Wall [7]. It
thus appears that if $H(x_1)$ has an inverse, then the conclusion of Theorem 8.3
is a trivial consequence of (8.31). It would be interesting to know whether
Theorem 8.3 could be strengthened by establishing existence of a regular
moment function $\mu_3(z)$ satisfying (8.31) even when $H(x_1)$ does not have an
inverse.

The sequences $k^q$, in which $\Re q=0$, $q \ne 0$, are not the only sequences of
which the question of summability $H(x)$ is settled by vanishing or
non-vanishing of a single moment of $x(t)$. Let $p$ be a positive integer and let

$s_k^{(p)} = k!/(k-p)!, \qquad k=0, 1, 2, \cdots,$

where $1/(k-p)!$ is interpreted to be 0 when $k=0, 1, \cdots, p-1$. Let $H(x)$ be
regular, and let $\sigma_n^{(p)}$ denote the $H(x)$ transform of $s_k^{(p)}$. Then

$\sigma_n^{(p)} = \int_0^1 \sum_{k=p}^{n} \frac{n!}{(k-p)!\,(n-k)!}\, t^k (1-t)^{n-k}\, dx(t) = \frac{n!}{(n-p)!} \int_0^1 t^p \sum_{k=p}^{n} C_{n-p,k-p}\, t^{k-p} (1-t)^{n-k}\, dx(t)$

so that

$\sigma_n^{(p)} = \frac{n!}{(n-p)!} \int_0^1 t^p\, dx(t) = \frac{n!}{(n-p)!}\, \mu(p);$

this formula was obtained by Silverman [19] by another method. It is a
consequence of this formula that the sequence $n!/(n-p)!$, in which $p$ is a
positive integer, is summable by a regular Hausdorff method $H(x)$ if and
only if $\mu(p)=0$, $\mu(z)$ being the moment function of $x(t)$. This implies that if
$H(x_2)$ and $H(x_1)$ are regular Hausdorff methods with moment functions
$\mu_2(z)$ and $\mu_1(z)$, and if $H(x_2)$ includes $H(x_1)$, then each positive integer zero
of $\mu_1(z)$ must be a zero of $\mu_2(z)$.
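The relation between the transform of $n!/(n-p)!$ and the single moment $\mu(p)$ can be verified exactly for two familiar methods. The sketch below uses rational arithmetic and assumes the standard forms of the Euler transform $E_r$ (moment function $r^z$) and of the arithmetic mean $C_1$ (moment function $1/(z+1)$). The helper names and the sample values of $r$, $p$ and $n$ are illustrative.

```python
from fractions import Fraction
from math import comb, factorial

def falling(k, p):
    """k!/(k-p)!, read as 0 when k < p."""
    return factorial(k) // factorial(k - p) if k >= p else 0

def euler_transform_at(n, r, seq):
    """E_r mean: sum over k of C(n,k) r^k (1-r)^(n-k) seq(k)."""
    return sum(comb(n, k) * r**k * (1 - r)**(n - k) * seq(k)
               for k in range(n + 1))

def cesaro_transform_at(n, seq):
    """C_1 mean: (seq(0) + ... + seq(n)) / (n + 1)."""
    return Fraction(sum(seq(k) for k in range(n + 1)), n + 1)

r, p, n = Fraction(1, 3), 2, 12
euler_lhs = euler_transform_at(n, r, lambda k: falling(k, p))
euler_rhs = falling(n, p) * r**p               # n!/(n-p)! times mu(p) = r^p
cesaro_lhs = cesaro_transform_at(n, lambda k: falling(k, p))
cesaro_rhs = falling(n, p) * Fraction(1, p + 1)  # mu(p) = 1/(p+1)
print(euler_lhs == euler_rhs, cesaro_lhs == cesaro_rhs)
```

Neither moment function vanishes at a positive integer, so neither method sums $n!/(n-p)!$; a method whose moment function does vanish at $p$ would send the whole sequence to 0.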




We are now in a position to prove the following theorem which, in
particular, implies that the LeRoy method $L$ does not include $\mathcal{H}$.

THEOREM 8.4. No totally regular method of summability can include $\mathcal{H}$.

A method $T$ of summability is totally regular if it is regular, and if also
the $T$ transform of each real sequence $s_n$ for which $s_n \to +\infty$ diverges to
$+\infty$. Let $p$ be a positive integer and let $H(x)$ be a regular Hausdorff method
such that the moment function $\mu(z)$ of $x(t)$ vanishes when $z=p$. Then the sequence
$n!/(n-p)!$ is summable $H(x)$ to 0. Hence the sequence is summable $\mathcal{H}$ to 0.
But when $T$ is totally regular, the sequence is not summable $T$. This proves
Theorem 8.4.


REFERENCES 


1. R. P. Agnew, The behavior of bounds and oscillations of sequences of functions under regular 
transformations, these Transactions, vol. 32 (1930), pp. 669-708. 

2. ———, The effects of general regular transformations on oscillations of sequences of func- 
tions, these Transactions, vol. 33 (1931), pp. 411-424. 

3. ———, Some remarks on a paper entitled “General Tauberian theorems,” Journal of the
London Mathematical Society, vol. 15 (1940), pp. 242-246.

4. E. Borel, Leçons sur les Fonctions Entières, Paris, 1900.

5. G. Doetsch, Über die Summabilität von Potenzreihen auf dem Rande des Borelschen
Summabilitätspolygons, Mathematische Annalen, vol. 84 (1921), pp. 245-251.

6. H. L. Garabedian, On the relation between certain methods of summability, Annals of 
Mathematics, (2), vol. 32 (1931), pp. 83-106. 

7. H. L. Garabedian, E. Hille, and H. S. Wall, Formulations of the Hausdorff inclusion 
problem, Duke Mathematical Journal, vol. 8 (1941), pp. 193-213. 

8. H. L. Garabedian and H. S. Wall, Hausdorff methods of summation and continued frac- 
tions, these Transactions, vol. 48 (1940), pp. 185-207. 

9. F. Hausdorff, Summationsmethoden und Momentfolgen, I and II, Mathematische Zeit- 
schrift, vol. 9 (1921), pp. 74-109 and 280-299. 

10. E. Hille, Essai d'une bibliographie de la représentation analytique d'une fonction
monogène, Acta Mathematica, vol. 52 (1929), pp. 1-80.

11. E. Hille and J. D. Tamarkin, Questions of relative inclusion in the domain of Hausdorff
means, Proceedings of the National Academy of Sciences, vol. 19 (1933), pp. 573-577.

12. W. A. Hurwitz, Report on topics in the theory of divergent series, Bulletin of the American
Mathematical Society, vol. 28 (1922), pp. 17-36.

13. W. A. Hurwitz and L. L. Silverman, On the consistency and equivalence of certain
definitions of summability, these Transactions, vol. 18 (1917), pp. 1-20.

14. K. Knopp, Über das Eulersche Summierungsverfahren, I and II, Mathematische
Zeitschrift, vol. 15 (1922), pp. 226-253 and vol. 16 (1923), pp. 125-156.

15. D. S. Morse, Relative inclusiveness of certain definitions of summability, American Jour- 
nal of Mathematics, vol. 45 (1923), pp. 259-285. 

16. E. Phragmén, Sur le domaine de convergence de l'intégrale infinie $\int_0^{\infty} F(ax)e^{-a}\,da$,
Comptes Rendus de l'Académie des Sciences, Paris, vol. 132 (1901), pp. 1396-1399.

17. H. R. Pitt, General Tauberian theorems, Proceedings of the London Mathematical 
Society, (2), vol. 44 (1938), pp. 243-288. 

18. H. Rademacher, Über den Konvergenzbereich der Eulerschen Reihentransformation,
Sitzungsberichte der Berliner Mathematischen Gesellschaft, vol. 21 (1922), pp. 16-24.

19. L. L. Silverman, On the omission of terms in certain summable series, Recueil
Mathématique, vol. 33 (1926), pp. 375-384.

20. L. L. Silverman and J. D. Tamarkin, On the generalization of Abel’s theorem for certain 
definitions of summability, Mathematische Zeitschrift, vol. 29 (1928), pp. 161-170. 


CORNELL UNIVERSITY,
ITHACA, N. Y.


THE SPECTRUM OF LINEAR TRANSFORMATIONS 


BY 
EDGAR R. LORCH 


I. INTRODUCTION 


The determination of the structure of a linear transformation $T$ which
maps a complex vector space $\mathfrak{B}$ into itself is naturally a primary object of the
theory of linear transformations. The first stage of such a study centers about
the question of reducibility. The concept may be introduced in various ways.
In one formulation it is required to find all projections (bounded idempotent
transformations) $P$ which commute with $T$. At a more incisive level, one seeks
those pairs of manifolds $\mathfrak{M}$ and $\mathfrak{N}$ (not necessarily disjoint) which “split” the
space and are transformed into themselves by $T$. For transformations in general
vector spaces the known results all refer to special types such as the completely
continuous or the weakly almost periodic. This paper will deal almost
exclusively with the first type of reducibility of the general bounded
transformation. The boundedness of $T$ is assumed for convenience; in case merely
its closure is hypothesized the salient features of the theory are still valid.

The results are all based on one method, that of a contour integral of the
resolvent of $T$. They seem to exhaust the possibilities for this particular tool.
Means for cracking the spectrum directly would undoubtedly have to be of
a much more delicate nature. The fundamental projection is the integral

$P = \frac{1}{2\pi i} \int_C \frac{d\zeta}{\zeta I - T}$

evaluated over a simple closed contour lying entirely in the resolvent set
of $T$(1). An integral bearing suggestive resemblance to this one is used by
E. Hille in an analysis of semi-groups of linear transformations(2). The range
of $P$ consists entirely of those elements in $\mathfrak{B}$ associated with the spectral
values lying within the curve $C$. Similarly the spectrum of $T$ over the range of
$I-P$ lies outside of $C$. If one associates the projection $P$ with the point set
interior to $C$ one has an instance of a mapping of so-called spectral sets upon
projections. It is shown that this mapping is a homomorphism. A spectral set
is any set obtained from the interiors of curves $C$ by the operations of
complementation, addition, and intersection performed a finite number of times.
The attempt to extend the homomorphism so as to admit of denumerable
addition and intersection has to be abandoned even if the space $\mathfrak{B}$ is required
to be reflexive and the more general type of reducibility (vide supra) is
considered.

Presented to the Society, May 2, 1941; received by the editors March 24, 1941, and, in
revised form, July 16, 1941.

(1) This transformation is well known in the theory of finite matrices. It was introduced
by Frobenius, Über die schiefe Invariante einer bilinearen oder quadratischen Form, Crelles Journal,
vol. 86 (1879), pp. 44-71. The construction of $P$ and the demonstration of its properties in this
case are quite simple and are carried out by methods characteristic of that theory.

(2) E. Hille, Notes on linear transformations. II. Analyticity of semi-groups, Annals of
Mathematics, (2), vol. 40 (1939), pp. 1-47. As a principal problem of this paper is one in
interpolation, Professor Hille uses an integral of the type
$T^{\alpha} = (2\pi i)^{-1} \int_C \zeta^{\alpha} (\zeta I - T)^{-1}\, d\zeta$ where $\alpha$ is a complex
power and $C$ represents a simple closed curve essentially containing the entire spectrum of $T$ in
its interior. Thus the emphasis here is on the factor $\zeta^{\alpha}$. In our case, on the contrary, the
choice of the curve $C$ is paramount. Added November 12, 1941.


II. THE FUNDAMENTAL PROJECTIONS 


Notations and definitions. The notations used are these: Complex numbers
are denoted by $\alpha$, $\beta$, $\gamma$, $\zeta$, $\lambda$, $\mu$, and so on; elements of the space
$\mathfrak{B}$ by $f$, $g$, $h$, and so on; linear transformations of $\mathfrak{B}$ into a subset of itself
by $T$, $S$, $I$ (the identity), $0$ (the zero), $P$, and so on.

The underlying space $\mathfrak{B}$ is a normed linear complex vector space, that is,
a complex Banach space. The transformations $T$ are assumed to be linear or
distributive, $T(\alpha f+\beta g)=\alpha Tf+\beta Tg$. Unless explicitly stated, $T$ is bounded,
$\|Tf\| \le K\|f\|$, for arbitrary $f \in \mathfrak{B}$; the bound of $T$ is denoted by $|T|$.

A number $\lambda$ is said to belong to the resolvent set $R$ of $T$ if the transformation
$T-\lambda I$ maps $\mathfrak{B}$ upon itself in a one-to-one manner. It results from this
fact that $(T-\lambda I)^{-1}$ is a bounded linear transformation(3).

If for a given $\lambda$, $Tf-\lambda f=0$ for an $f \ne 0$, $\lambda$ is said to belong to the point
spectrum of $T$. If $\lambda$ belongs neither to the point spectrum nor to the resolvent
set, the transformation $T-\lambda I$ may be inverted and has a range which is either
dense in $\mathfrak{B}$ (but not identical with $\mathfrak{B}$), or is not dense in $\mathfrak{B}$. In the first case
$\lambda$ is said to belong to the continuous spectrum of $T$; in the second, $\lambda$ is said to
belong to the residual spectrum of $T$. If $\lambda$ is in the continuous spectrum, the
transformation $(T-\lambda I)^{-1}$ is unbounded and there exist elements $f_n \in \mathfrak{B}$,
$n=1, 2, \cdots$, such that $\|f_n\|=1$, $\|(T-\lambda I)f_n\| \to 0$. The collection of values
in the point, continuous, and residual spectra is called the spectrum $\mathfrak{S}$ of $T$.
These three classes in addition to the resolvent set are mutually exclusive and
all inclusive in the complex plane.


Some elementary theorems. In this section some elementary theorems are 
stated. 


THEOREM 1. The value $\lambda$ is in the resolvent set (spectrum) of $T$ if and only
if the value $\lambda$ is in the resolvent set (spectrum) of the adjoint $\bar{T}$ of $T$. If $\lambda$
is in the point spectrum of $T$, $\lambda$ is in the point or residual spectrum of $\bar{T}$. If
$\lambda$ is in the residual spectrum of $T$, $\lambda$ is in the point spectrum of $\bar{T}$. If $\lambda$ is
in the continuous spectrum of $T$, $\lambda$ is in the continuous or residual spectrum of
$\bar{T}$. If the space $\mathfrak{B}$ is reflexive and $\lambda$ is in the continuous spectrum of $T$,
then $\lambda$ is in the continuous spectrum of $\bar{T}$(4).

(3) Banach, Théorie des Opérations Linéaires, p. 41.


The theorem is an immediate consequence of the definition of the adjoint 
transformation and of some elemental properties of linear transformations. 


THEOREM 2. If $\lambda$ belongs to the resolvent set $R$ of $T$, then so do all $\zeta$ with
$|\zeta-\lambda| < 1/|(T-\lambda I)^{-1}|$. Thus $R$ is an open set and the spectrum $\mathfrak{S}$ is a
closed set. The transformation $(T-\zeta I)^{-1}$ may be expressed “analytically” in
terms of $(T-\lambda I)^{-1}$ by means of the formula(5):

(1)  $(T-\zeta I)^{-1} = \sum_{n=0}^{\infty} (\zeta-\lambda)^n (T-\lambda I)^{-(n+1)}.$

The bound of $(T-\zeta I)^{-1}$ is a continuous function of $\zeta$.

No proof will be given.

That the resolvent set is not empty may be shown as follows: Let $\lambda$ be so
chosen that $|\lambda| > |T|$. Then since $\|(T-\lambda I)f\|$ is bounded away from zero when
$\|f\|=1$, $\lambda$ is neither in the point nor in the continuous spectrum. By Theorem 1
and the fact that $|T|=|\bar{T}|$, $\lambda$ is not in the residual spectrum. Thus
$\lambda \in R$. The fact that $\mathfrak{S}$ is not empty will be shown later.


THEOREM 3. If $\lambda$ and $\mu$ are any two values in the resolvent set $R$ of $T$, then(6)

(2)  $(\lambda I - T)^{-1} - (\mu I - T)^{-1} = (\mu-\lambda)(\lambda I - T)^{-1}(\mu I - T)^{-1}.$

In proof, multiply both sides of (2) by $(\lambda I-T)(\mu I-T)$.
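In finite dimensions the functional equation of the resolvent can be checked directly. The sketch below does so for a $2 \times 2$ matrix, taking the equation in its standard form $(\lambda I-T)^{-1}-(\mu I-T)^{-1}=(\mu-\lambda)(\lambda I-T)^{-1}(\mu I-T)^{-1}$. The matrix, the two sample points and the helper names are arbitrary illustrative choices.

```python
def resolvent(zeta, T):
    """(zeta*I - T)^(-1) for a 2x2 matrix T given as ((a, b), (c, d))."""
    (a, b), (c, d) = T
    m00, m01, m10, m11 = zeta - a, -b, -c, zeta - d
    det = m00 * m11 - m01 * m10
    return ((m11 / det, -m01 / det), (-m10 / det, m00 / det))

def matmul(A, B):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

T = ((2, 1), (1, 3))           # real symmetric, so its spectrum is real
lam, mu = 5 + 1j, -1 + 2j      # non-real points, hence in the resolvent set
R_lam, R_mu = resolvent(lam, T), resolvent(mu, T)
product = matmul(R_lam, R_mu)
residual = max(abs(R_lam[i][j] - R_mu[i][j] - (mu - lam) * product[i][j])
               for i in range(2) for j in range(2))
print(residual)
```

The residual is at the level of rounding error, as the one-line proof by multiplication would suggest.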

Let $T(\zeta)$ be a function defined over some set of complex numbers $\zeta$ and
whose values are bounded linear transformations of $\mathfrak{B}$ into itself. For such a
transformation depending continuously upon a parameter will be defined a
curvilinear integral along a given rectifiable curve $C$. The function $T(\zeta)$ is
said to be continuous if it is continuous in the uniform topology, that is,
$|T(\zeta_1)-T(\zeta_2)|$ is small with $|\zeta_1-\zeta_2|$.


LEMMA. Let $C$ be a rectifiable curve in the complex plane, $\zeta = \zeta(t)$, $0 \le t \le 1$,
and $T(\zeta)$ be a transformation depending continuously upon a parameter $\zeta$ and
defined on $C$. Then the Riemann integral


(3) $A = \int_C T(\zeta)\, d\zeta$


(4) Some of these statements may be found established for Hilbert space by Stone, Linear
Transformations in Hilbert Space and Their Applications to Analysis, American Mathematical
Society Colloquium Publications, vol. 15, chap. 4.

(5) This is the well known Neumann expansion of the resolvent. The fact that this formula
holds for a general transformation in a Banach space has been noted by A. E. Taylor, The
resolvent of a closed transformation, Bulletin of the American Mathematical Society, vol. 44
(1938), pp. 70-74. In the same paper one will find our Theorem 3.

(6) This is the well known functional equation for the resolvent.




exists and defines a bounded linear transformation. The bound of $A$ satisfies
$|A| \le \max_{\zeta \in C} |T(\zeta)| \cdot l$ where $l$ is the length of $C$. If the transformations $T(\zeta)$ are
commutative, then $A$ is commutative with $T(\zeta)$.


The integral is defined in the classic way. Let $t_0 = 0 < t_1 < \cdots < t_n = 1$ be a
subdivision of the unit interval. Let $t_i'$ with $t_{i-1} \le t_i' \le t_i$, $i = 1, \cdots, n$, be $n$
points, one in each subinterval. Let $\zeta_i' = \zeta(t_i')$ and $\Delta\zeta_i = \zeta(t_i) - \zeta(t_{i-1})$. Then
an approximating sum to $A$ is $\sum_{i=1}^{n} T(\zeta_i')\, \Delta\zeta_i$. The proof of the convergence
is omitted. The remaining statements of the theorem are clear.

The fundamental projections. In this section will be introduced the pro- 
jections which underlie the entire investigation. 


THEOREM 4. Let $C$ be a simple closed rectifiable curve lying entirely within
the resolvent set $R$ of $T$. Then the contour integral


(4) $P = \frac{1}{2\pi i} \int_C \frac{d\zeta}{\zeta I - T}$


exists and represents a bounded linear transformation commutative with $T$,
$PT = TP$. The transformation $P$ is unchanged if the curve $C$ is continuously
deformed into a curve $C'$, providing only that the deformation is effected without
going outside of the resolvent set.


It is hardly necessary to state that the transformation $\zeta I - T$ appearing
in the denominator of (4) represents $(\zeta I - T)^{-1}$. By Theorem 2, $|(\zeta I - T)^{-1}|$
is a continuous function of $\zeta$; hence by the lemma, the integral (4) exists,
defining a bounded transformation $P$. Note that the direction of integration
along $C$ is counterclockwise.
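In finite dimensions the integral (4) can be approximated by the Riemann sums of the lemma. The sketch below is an illustration only: the matrix `T` (eigenvalues 2 and 0.5), the choice of the unit circle as $C$, the helper names, and the number of subdivisions are all assumptions made for the example, not part of the paper.

```python
import cmath
import math

def resolvent(T, z):
    """Return (zI - T)^{-1} for a 2x2 matrix T (adjugate formula)."""
    a, b = z - T[0][0], -T[0][1]
    c, d = -T[1][0], z - T[1][1]
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def contour_projection(T, center, radius, n=400):
    """Riemann-sum approximation of (1/2 pi i) * integral of (zI - T)^{-1} dz
    over the circle |z - center| = radius, traversed counterclockwise."""
    P = [[0j, 0j], [0j, 0j]]
    for j in range(1, n + 1):
        z_prev = center + radius * cmath.exp(2j * math.pi * (j - 1) / n)
        z_next = center + radius * cmath.exp(2j * math.pi * j / n)
        z_mid = center + radius * cmath.exp(2j * math.pi * (j - 0.5) / n)
        R = resolvent(T, z_mid)          # integrand at a point of the subarc
        dz = z_next - z_prev             # increment of the curve parameter
        for r in range(2):
            for c in range(2):
                P[r][c] += R[r][c] * dz / (2j * math.pi)
    return P

def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def max_gap(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(2) for j in range(2))

T = [[2.0, 1.0], [0.0, 0.5]]             # only the eigenvalue 0.5 lies inside |z| = 1
P = contour_projection(T, 0.0, 1.0)

idempotent_gap = max_gap(mul(P, P), P)           # how far P^2 is from P
commutator_gap = max_gap(mul(P, T), mul(T, P))   # how far PT is from TP
print(P, idempotent_gap, commutator_gap)
```

For this sample matrix the sum should approach the matrix with rows $(0, -2/3)$ and $(0, 1)$, which projects onto the characteristic vector belonging to 0.5 along the one belonging to 2.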

Let $\lambda$ be in $R$ and $K$ be a rectifiable closed curve, simple or not, lying
entirely within the $\zeta$-circle $|\zeta - \lambda| = r < 1/|(T - \lambda I)^{-1}|$. The integral


(5) $\int_K \frac{d\zeta}{\zeta I - T}$


is computed with the help of series (1); termwise integration shows that its
value is zero. Now the curve $C$ may be deformed into the curve $C'$ with the
help of a finite number of curves of the type $K$. Thus the integral (4) evaluated
over $C$ is identical with that integral over $C'$.


THEOREM 5. Let C and C’ be two simple closed rectifiable curves lying en- 
tirely within the resolvent set R of T. Let P and P’ be the integrals (4) associated 
with C and C’, respectively. If the curve C lies entirely within the curve C' then 


(6) $PP' = P'P = P.$
If the curves $C$ and $C'$ lie each exterior to the other, then
(7) $PP' = P'P = 0.$




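That $PP' = P'P$ is clear from the integral definition of both $P$ and $P'$ which
brands them as "functions of $T$."
By virtue of Theorem 3, one may write for an arbitrary pair of curves $C$
and $C'$

$PP' = \left(\frac{1}{2\pi i}\right)^2 \int_C \int_{C'} \frac{1}{\zeta I - T} \cdot \frac{1}{\xi I - T}\, d\zeta\, d\xi = \left(\frac{1}{2\pi i}\right)^2 \int_C \int_{C'} \frac{1}{\xi - \zeta} \left( \frac{1}{\zeta I - T} - \frac{1}{\xi I - T} \right) d\zeta\, d\xi.$

If the last integral is broken into two parts, the results of the theorem are
readily obtained from a trivial integration in the complex plane.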
It should be noted that the conditions imposed by the present theorem 
on C and C’ could be weakened without impairing the results (6) and (7). 
It is sufficient for (6) to require that C and C’ be deformable within R to 
two new curves which lie one inside the other. Likewise, (7) holds if C and C’ 
are deformable within R to two new curves which lie exterior to each other. 
All this is possible by virtue of Theorem 4. 


The next theorem announces the most important property of the trans- 
formation P. 


THEOREM 6. The transformation $P$ of Theorem 4 is a projection, that is,
$P^2 = P$. Furthermore, it reduces $T$, that is, $PT = TP$.


Let C’ be a curve in R such that the curve C defining P lies entirely in 
the interior of C’ and such that C and C’ may be deformed into each other 
within R. Let P’ be the transformation (4) associated with C’. By Theorem 4, 
P=P’. By Theorem 5, PP’=P. Hence P?=P. 

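That $PT = TP$ has already been stated in Theorem 4. The significance of
the role of $P$ relative to $T$ now becomes clear. The space $\mathfrak{B}$ may be thought
of as the direct sum of two spaces $\mathfrak{M}$ and $\mathfrak{N}$, $\mathfrak{B} = \mathfrak{M} + \mathfrak{N}$, with $\mathfrak{M}$ the totality
of elements $f \in \mathfrak{B}$ for which $Pf = f$, and with $\mathfrak{N}$ the totality of elements $g \in \mathfrak{B}$
for which $Pg = 0$. Since $PTf = TPf = Tf$ for $f \in \mathfrak{M}$, $T\mathfrak{M} \subset \mathfrak{M}$. In the same manner
$T\mathfrak{N} \subset \mathfrak{N}$. Thus the study of $T$ in $\mathfrak{B}$ reduces to the study of $T$ in the two
spaces $\mathfrak{M}$ and $\mathfrak{N}$.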

A special case. A special case of considerable interest throws light on the
character of the elements $f$ for which $Pf = f$ (or $Pf = 0$). This character is
completely revealed by the behavior of the sequence of iterates $T^n f$.


THEOREM 7. Let the unit circle $C$ with center at the origin of the complex plane
lie entirely in the resolvent set $R$ of $T$. Then the projection $P$ defined by (4)
satisfies the relation


(8) $P = \lim_{n \to \infty} (I - T^n)^{-1}.$


Furthermore, $Pf = f$ if and only if $\lim_{n \to \infty} \|T^n f\| = 0$. And if $Pf = 0$, $f \ne 0$, then
$\lim_{n \to \infty} \|T^n f\| = \infty$.

If $T^{-1}$ exists, the projection associated by (4) to the transformation $T^{-1}$ and
for the unit circle $C$ is $I - P$ where $P$ is as above. The elements $f$ for which $Pf = 0$
are precisely those for which $\lim_{n \to \infty} \|T^{-n} f\| = 0$.

In addition, if $f$ $(\ne 0)$ is such that $\lim_{n \to \infty} \|T^n f\| = 0$, then $\lim_{n \to \infty} \|T^{-n} f\| = \infty$.
If $f$ $(\ne 0)$ is such that $\lim_{n \to \infty} \|T^{-n} f\| = 0$ then $\lim_{n \to \infty} \|T^n f\| = \infty$.


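The integral (4) in the special case indicated by the present theorem may
be evaluated in the following manner: Let $\alpha = \exp(2\pi i/n)$. Consider as an
approximation to the required integral the finite sum


$\frac{1}{2\pi i} \sum_{j=0}^{n-1} (\alpha^j I - T)^{-1} (\alpha^{j+1} - \alpha^j).$


Replace this sum by its equivalents


$\frac{n(\alpha - 1)}{2\pi i} \cdot \frac{1}{n} \sum_{j=0}^{n-1} \alpha^j (\alpha^j I - T)^{-1} = \frac{n(\alpha - 1)}{2\pi i} (I - T^n)^{-1}.$


Since $\lim_{n \to \infty} n(\alpha - 1) = 2\pi i$, equation (8) has been derived.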
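The relation


(9) $(I - T^n)^{-1} = I + (I - T^n)^{-1} T^n$


may be verified immediately by multiplication with $(I - T^n)$. It implies that
if $\lim_{n \to \infty} \|T^n f\| = 0$, then since $|(I - T^n)^{-1}|$ is bounded (by (8)), it follows that
$\lim_{n \to \infty} (I - T^n)^{-1} T^n f = 0$. Applying (9) once more, this means $\lim_{n \to \infty} (I - T^n)^{-1} f = f$.
In résumé, if $\lim_{n \to \infty} \|T^n f\| = 0$, then $Pf = f$.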

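Now, assume that $Pf = f$. Write $(I - T^n)^{-1} = Q_n$, and let $n_0$ be so chosen that
$|P - Q_n| < \epsilon < 1/2$ for $n > n_0$. Let $Q_n f = g_n$. Then since $PT^n = T^n P$, $PT^n f = T^n f$,
and (9) may be written


$g_n - f = T^n f + (Q_n - P) T^n f.$


Upon taking norms, this yields $\|T^n f\| \le \|g_n - f\| + \epsilon \|T^n f\|$. Finally, since
$\|g_n - f\| = \|(Q_n - P) f\| \le \epsilon \|f\|$, $\|T^n f\| \le \epsilon/(1 - \epsilon) \|f\|$ for $n > n_0$. This shows that
$\lim_{n \to \infty} \|T^n f\| = 0$.

If $Pf = 0$, (9) yields $Q_n f = f + Q_n T^n f = f + (Q_n - P) T^n f$, since
$PT^n f = T^n Pf = 0$; and hence, since $\|Q_n f\| = \|(Q_n - P) f\| \le \epsilon \|f\|$, one has
$\|f\| \le \epsilon \|f\| + \epsilon \|T^n f\|$. This gives $\|T^n f\| \ge ((1 - \epsilon)/\epsilon) \|f\|$ for $n > n_0$,
which means either $f = 0$ or $\lim_{n \to \infty} \|T^n f\| = \infty$.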

If $T^{-1}$ exists, the complex unit circle is in the resolvent set of $T^{-1}$ and (9)
shows that $\lim_{n \to \infty} (I - T^{-n})^{-1} = I - P$. Proof of the remaining statements of
the theorem may now be based on what has preceded.
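Relation (8) and the dichotomy of the iterates can be observed on a small example. The sketch below is an illustration under assumed data: a hypothetical $2 \times 2$ matrix `T` with eigenvalues 2 and 0.5 (so the unit circle lies in its resolvent set), the exponent `n = 40`, and two sample vectors; none of these appear in the paper.

```python
# Illustrative check of (8) for an assumed 2x2 matrix with no spectrum on |z| = 1.

def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(A):
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def power(T, n):
    """n-th power of a 2x2 matrix by repeated multiplication."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n):
        result = mul(result, T)
    return result

def q(T, n):
    """Q_n = (I - T^n)^{-1}."""
    Tn = power(T, n)
    return inverse([[1.0 - Tn[0][0], -Tn[0][1]], [-Tn[1][0], 1.0 - Tn[1][1]]])

def apply(A, f):
    return [A[0][0] * f[0] + A[0][1] * f[1], A[1][0] * f[0] + A[1][1] * f[1]]

def norm(f):
    return (abs(f[0]) ** 2 + abs(f[1]) ** 2) ** 0.5

T = [[2.0, 1.0], [0.0, 0.5]]
Q = q(T, 40)                     # should be close to the projection P of (4)

f_inside = [-2.0 / 3.0, 1.0]     # characteristic vector for 0.5: here Pf = f
f_outside = [1.0, 0.0]           # characteristic vector for 2:   here Pf = 0
T40 = power(T, 40)
small = norm(apply(T40, f_inside))    # iterates tend to zero
large = norm(apply(T40, f_outside))   # iterates grow without bound
print(Q, small, large)
```

For this sample matrix `Q` agrees to rounding error with the matrix having rows $(0, -2/3)$ and $(0, 1)$, the same projection that the contour integral (4) yields for the unit circle.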

Some elementary properties of the fundamental projections. The relation- 
ship between the spectrum of T and the projection P is given in the next theo- 
rem. 




THEOREM 8. The projection $P$ defined by the curve $C$ as in Theorem 4 has
the following properties:

(a) If $\lambda$ is in the point spectrum of $T$, $Tf = \lambda f$ with $f \ne 0$, then $Pf = f$ or $Pf = 0$
according as $\lambda$ lies within $C$ or without $C$.

(b) If $\lambda$ is in the continuous spectrum and $\{f_n\}$ is a sequence of elements
in $\mathfrak{B}$ with the properties $\|f_n\| = 1$, $n = 1, 2, \cdots$, $\lim_{n \to \infty} \|Tf_n - \lambda f_n\| = 0$, then
$\lim_{n \to \infty} \|Pf_n - f_n\| = 0$ if $\lambda$ lies within $C$; $\lim_{n \to \infty} \|Pf_n\| = 0$ if $\lambda$ lies without $C$.

(c) The equation $P = 0$ is valid if and only if every point in the interior of $C$
lies in the resolvent set of $T$.

(d) The equation $P = I$ is valid if and only if every point in the exterior of $C$
is in the resolvent set of $T$.


The proof of (a) is immediate; that of (b) rests on the existence of elements
$f$ for which $Tf = \lambda f$ approximately. In (c) if the interior of $C$ is in $R$,
$C$ may be shrunk to a point, hence $P = 0$ by Theorem 4. If $P = 0$, application
of (a) and (b) excludes the point and continuous spectrum from the interior
of $C$. The residual spectrum is treated by reverting to the adjoint space $\bar{\mathfrak{B}}$. For (d),
if $P = I$, clearly, the exterior of $C$ lies in $R$. In the converse case, assume $C$
to be a circle with center in $R$ and map its exterior upon the interior of the
unit circle, then apply Theorem 7.

A by-product of this theorem is that every transformation has at least
one point in its spectrum, a fact that has already been noted(7).


III. THE HOMOMORPHISM OF SPECTRAL SETS AND PROJECTIONS 


Spectral sets. Let $C$ be a simple closed rectifiable curve lying entirely in
the resolvent set of $T$. The curve $C$ determines two sets of points, its interior
$C^i$, and its exterior $C^e$. The totality of sets formed from the sets $C^i$
and $C^e$ (for all possible curves $C$) by the operations of complementation, finite
set addition, and finite set intersection forms a Boolean algebra of sets. Any
set in this algebra of sets will be called a spectral set of $T$.

With each spectral set $M$ will be associated a projection $P_M$ which reduces
$T$. The method of assigning $P_M$ to $M$ will be the following: To the
interior $C^i$ of $C$ will be assigned the projection $P$ of Theorem 4. To the exterior
$C^e$ of $C$ will be assigned the projection $I - P$. To the set complements
of $C^i$ and $C^e$ will be assigned the projections $I - P$ and $P$, respectively. To
the intersection of the sets $C_1^{u_1}, \cdots, C_n^{u_n}$, where the $u_j$ are symbols to
denote an "e," an "i," or the complement of an "e" or "i," will be assigned
the product of the corresponding projections. As the projections are commutative,
the order of the terms in the product does not play any role. To
a sum $C_1^{u_1} + C_2^{u_2} + \cdots + C_n^{u_n}$ will be assigned the "starred sum" of the
corresponding projections. The starred sum of the commutative projections
$P_1, \cdots, P_n$ is defined by $P_1 \dotplus \cdots \dotplus P_n = I - (I - P_1) \cdots (I - P_n)$.


(7) Taylor, loc. cit.




Proceeding in this fashion, it is seen that if a spectral set $M$ is defined in
a specific way by the formation of sums and intersections of the elemental
sets $C_j^{u_j}$, a method is specified for forming the projection associated with $M$.
That procedure consists in replacing in the definition of $M$ the sets $C_j^{u_j}$ by
their projections, the operation of set intersection by that of projection
multiplication, and the operation of set addition by that of projection starred
addition. Finally if the projection $P$ is associated with the set $M$, the projection
$I - P$ will be associated with the set complementary to $M$. Thus corresponding
to a specified construction for a spectral set $M$, there is precisely one
projection $P$ defined.
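The starred sum mirrors set addition: for commuting projections it equals $I - (I - P_1) \cdots (I - P_n)$, which for two projections reduces to $P_1 + P_2 - P_1 P_2$. The following sketch is only an illustration; the two sample projections `P_a` and `P_b` (a complementary pair of $2 \times 2$ matrices) and all helper names are assumptions introduced for the example.

```python
# Illustrative starred sum of commuting projections, given as square nested lists.

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_sub(A, B):
    n = len(A)
    return [[A[i][j] - B[i][j] for j in range(n)] for i in range(n)]

def starred_sum(projections):
    """Return I - (I - P_1)(I - P_2)...(I - P_n) for commuting projections."""
    n = len(projections[0])
    product = identity(n)
    for P in projections:
        product = mat_mul(product, mat_sub(identity(n), P))
    return mat_sub(identity(n), product)

def max_gap(A, B):
    n = len(A)
    return max(abs(A[i][j] - B[i][j]) for i in range(n) for j in range(n))

P_a = [[0.0, -2.0 / 3.0], [0.0, 1.0]]    # a projection: P_a * P_a == P_a
P_b = mat_sub(identity(2), P_a)          # the complementary projection I - P_a

whole = starred_sum([P_a, P_b])          # plays the role of a set and its complement
same = starred_sum([P_a, P_a])           # adding a set to itself changes nothing
print(whole, same)
```

In this example the starred sum of a projection and its complement is the identity, and the starred sum of a projection with itself is the projection again, as the analogy with set addition suggests.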
Although the projection associated with $M$ seems to depend on the particular
manner in which $M$ is constructed, this is not the case. The proof,
which will not be set down, proceeds along these lines. In any two methods
of defining $M$, one replaces the auxiliary curves $C_j$ by simple polygons $D_j$
which are obtained from the $C_j$ by deformation. These $D_j$ generate in the
plane a finite number of polygonal regions $Q_j$. The set $M$ by either method
of definition consists (up to deformations) of certain of these regions. Finally,
any region which lies in one deformation of $M$ but not in the other lies in $R$.
Henceforth the projection associated with $M$ will be denoted by $P_M$.


THEOREM 9. The mapping $M \to P_M$ of the set algebra of spectral sets on the
associated class of projections is a homomorphism. Under the homomorphism
$P_M = 0$ if and only if $M$ lies entirely in the resolvent set of $T$.


The establishment of the homomorphism is virtually accomplished by the
discussion just preceding. Let $M_1$ and $M_2$ be two spectral sets. Then one
method of defining the projection associated to $M_1 + M_2$ (or $M_1 \cdot M_2$) is
to form the starred sum (or the product) of $P_{M_1}$ and $P_{M_2}$. But this is precisely the
homomorphism property. It is to be noted that since the projection associated with
the whole plane is the identity, the projection $P_{M'}$ associated with the set
complement $M'$ of $M$ satisfies $P_{M'} = I - P_M$. That the sets $M$ for which $P_M = 0$
are precisely those which contain no points of $S$ is a consequence of the
uniqueness argument just preceding this theorem.

Spectral components. To obtain an idea of the resolving power of spectral
sets, that is to say, of their ability to separate various parts of the spectrum,
a few facts concerning sets may be recalled. Since the spectrum of a bounded
transformation is closed and compact (or bounded) and since furthermore any
closed compact set may serve as the spectrum of some bounded transformation,
it is the general closed compact set which must be discussed. If $M$ is
such a set and $a$ is any point in $M$, the set sum of all continua in $M$ containing
$a$ is a continuum called a component of $M$(8). Thus $M$ may be expressed
as the sum of a finite or infinite number of distinct components. Since a
component is a closed compact set, each two components lie at a positive distance


(8) Hausdorff, Mengenlehre, p. 152.




from each other. Furthermore, if a simple closed curve has no point in com- 
mon with a given component, that component lies entirely inside or outside 
the given curve. 

If $K_1$ and $K_2$ are two distinct components (assuming that $M$ consists of
more than one component), then there exist closed sets $M_1$ and $M_2$ such that
$M = M_1 + M_2$, $M_1 \supset K_1$, $M_2 \supset K_2$, and $M_1 \cdot M_2 = 0$(9). Finally, for the sets $M$, $M_1$,
and $M_2$ there exists a finite set of polygonal domains having the properties:
No point of $M$ is on the boundary of any domain; within any domain there is
at least one point of $M_1$; all the points of $M_2$ lie without these domains(10).
If $L$ represents a domain containing one point of $K_1$, $L$ contains $K_1$ in its
entirety. Thus $L$ is a spectral set separating $K_1$ from $K_2$.

The non-extensibility of the homomorphism. A natural question to propose
is whether the class of spectral sets $M$ and the projections $P_M$ can be
enlarged in such a way that an extended homomorphism reigns between the
two extended classes. Specifically, can this extension be carried out in such a
fashion that the new homomorphism is valid for denumerable sums and
intersections? Preliminary considerations show that it will be wise to forego the
insistence that a set $M$ correspond to a projection $P$ and substitute the
requirement that $M$ correspond to a pair of closed linear manifolds $\{\mathfrak{M}_M, \mathfrak{N}_M\}$
having the properties: $\mathfrak{M}_M$ and $\mathfrak{N}_M$ have no common element except 0;
together they span $\mathfrak{B}$. The notion of a projection $P_M$ is a special case of this
type where $\mathfrak{M}_M$ and $\mathfrak{N}_M$ are disjoint(11). $\mathfrak{M}_M$ is the set of elements $\{P_M f\}$,
$f \in \mathfrak{B}$; $\mathfrak{N}_M$ is the set of elements $\{f - P_M f\}$, $f \in \mathfrak{B}$. The general type of manifold
pair described above seems to play the leading role in the theory of rotations,
inter alia, in reflexive spaces(12).

Further considerations suggest that the heretofore unqualified nature of
the underlying space be restricted suitably so that it should have the "correct"
properties relative to infinite intersections and sums of closed linear manifolds.
Reflexive spaces seem to possess precisely the requisite properties(13).


(9) R. L. Moore, Foundations of Point Set Theory, American Mathematical Society
Colloquium Publications, vol. 13, p. 21, Theorem 35.

(10) Kerékjártó, Vorlesungen über Topologie, p. 31.

(11) $\mathfrak{M}_M$ and $\mathfrak{N}_M$ are disjoint if the elements $f + g$, $f \in \mathfrak{M}_M$, $g \in \mathfrak{N}_M$ are not only dense in $\mathfrak{B}$
but actually fill $\mathfrak{B}$.

(12) See the author's The integral representation of weakly almost-periodic transformations in
reflexive vector spaces, these Transactions, vol. 49 (1941), pp. 18-40.

(13) If $\bar{\mathfrak{B}}$ represents the space adjoint to $\mathfrak{B}$, then the reflexive property is defined by
$\bar{\bar{\mathfrak{B}}} = \mathfrak{B}$. Thus, for example, the spaces $L_p$ and $l_p$, $p > 1$, are reflexive. An important property
alluded to is the following: If $\{\mathfrak{M}_n\}$ is a monotone decreasing sequence of closed linear manifolds
and $\mathfrak{M}_n^{\perp}$ represents the largest manifold in $\bar{\mathfrak{B}}$ which is orthogonal to $\mathfrak{M}_n$, then
$(\prod_n \mathfrak{M}_n)^{\perp} = \sum_n \mathfrak{M}_n^{\perp}$ where $\sum_n \mathfrak{M}_n^{\perp}$ indicates the smallest closed linear manifold containing each $\mathfrak{M}_n^{\perp}$.

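Proof. Write $\mathfrak{M} = \prod_n \mathfrak{M}_n$ and $\mathfrak{N} = \sum_n \mathfrak{M}_n^{\perp}$. Since $\mathfrak{M}_n^{\perp} \perp \mathfrak{M}_n$, then $\mathfrak{M}_n^{\perp} \perp \mathfrak{M}$, or $\mathfrak{M}^{\perp} \supset \mathfrak{N}$. Suppose
$F \in \mathfrak{M}^{\perp}$, $F \notin \mathfrak{N}$. Then since $\mathfrak{B}$ is reflexive, there exists an $f$ such that $Ff = 1$ and $f \perp \mathfrak{N}$. In particular,
$f \perp \mathfrak{M}_n^{\perp}$, hence $f \in \mathfrak{M}_n$, $n = 1, 2, \cdots$. Thus $f \in \mathfrak{M}$ and $Ff = 0$. This contradiction proves that
$\mathfrak{M}^{\perp} = \mathfrak{N}$.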




An example will be given of a simple transformation in Hilbert space $\mathfrak{H}$
(which is reflexive) to indicate that the desired extension of the homomorphism
is impossible. Let $A$ be the linear transformation defined by the matrix
$\|a_{ij}\|$, $i, j = 0, 1, 2, \cdots$, with $a_{0j} = 0$, $j = 0, 1, 2, \cdots$
and [...], $i, j = 1, 2, \cdots$. Since $\sum |a_{ij}|^2 < \infty$, the transformation
is of finite norm, hence certainly bounded. The spectrum of $A$ is found by
examination of the equation


(10) $A(x_0, x_1, \cdots) - \lambda(x_0, x_1, \cdots) = (y_0, y_1, \cdots)$


where $\sum |x_n|^2 < \infty$, $\sum |y_n|^2 < \infty$. It is found that if $\lambda$ does not assume
values in the set $M = \{0, 1/2, 1/4, \cdots, 1/2^n, \cdots\}$, then (10) has a unique
solution $(x_0, x_1, \cdots)$ for all $(y_0, y_1, \cdots)$. Thus the values in the set
complementary to $M$ belong to the resolvent set of $A$. Also, $\lambda = 1/2^n$, $n = 1, 2, \cdots$,
is found to be a characteristic value for a single vector $(x_0, x_1, \cdots)$ where
$x_n = 1$, otherwise $x_i = 0$. Finally, the value $\lambda = 0$ is in the residual spectrum
since $Af = 0$ implies $f = 0$ and since the range of the transformation $A$ is not
dense in the space.

For the adjoint transformation $\bar A$, it is found that the value $\lambda = 1/2^n$,
$n = 1, 2, \cdots$, is in the point spectrum and that the one characteristic vector
for that value is the vector $(x_0, x_1, \cdots)$ where $x_0 = 1$, $x_n = 1/n$, otherwise
$x_i = 0$. The value $\lambda = 0$ is in the point spectrum of $\bar A$ and $\bar A(1, 0, 0, \cdots)$
$= (0, 0, \cdots)$.

If $C_n$ represents a curve containing in its interior all points of $M$ save
$1/2, 1/4, \cdots, 1/2^n$, and if $P_n$ represents the projection associated with $C_n$
and $A$, then the manifold $\{P_n f\}$, $f \in \mathfrak{H}$, includes the manifold $\mathfrak{M}_n$ of all elements
of the form $(x_0, 0, 0, \cdots, 0, x_{n+1}, x_{n+2}, \cdots)$ by Theorem 8(a). Similarly,
the manifold $\{f - P_n f\}$, $f \in \mathfrak{H}$, includes the manifold $\mathfrak{N}_n$ spanned by the
$n$ elements $(1, 1/2, 0, 0, \cdots)$, $(1, 0, 1/3, 0, \cdots)$ and $(1, 0, \cdots, 0, 1/n, 0, \cdots)$.
As $\mathfrak{M}_n$ and $\mathfrak{N}_n$ span $\mathfrak{H}$, the manifold $\{P_n f\}$, $f \in \mathfrak{H}$, is precisely $\mathfrak{M}_n$; the
manifold $\{f - P_n f\}$, $f \in \mathfrak{H}$, is precisely $\mathfrak{N}_n$.

If one writes M,=NRNi and N,=Mit, one sees that the projection P, 
associated with the curve C, and with A determines the manifolds M, 
and Now Also =([]m,)+ 
={(xo, 0, 0,---)}+#. This example shows that a homomorphism for 
denumerable intersections and sums cannot be expected. Also to be noted 
is that if one attempts to obtain the linear manifold which corresponds to 
the component of M consisting of the point \=0, one obtains the zero mani- 
fold, although the transformation has no singularities in this manifold! 

As is fairly apparent, the contour integral used as above does not yield all
possible projections which reduce $T$. This may be inferred easily from examples.
In the first place, there is the possibility of multiplicity in the spectrum.
Or even if the spectrum is simple, the integral may not "split" any



spectral component. Here one may refer for an example to the general rotation
in a reflexive space.

There is a way, nevertheless, of obtaining readily a somewhat finer resolution
into reducing manifolds than that exhibited above. If $f$ is any element
in $\mathfrak{B}$, let $\mathfrak{M}_f$ represent the closed linear manifold spanned by the elements $Uf$
where $U$ is any function of $T$ (one may let $U$ represent any rational function
of $T$). Then $T\mathfrak{M}_f \subset \mathfrak{M}_f$ and one may proceed as above replacing $\mathfrak{B}$ in the
previous argument by $\mathfrak{M}_f$. It is to be noted that this procedure does not supply
one with a manifold complementary to $\mathfrak{M}_f$. An obvious approach to problems
of the type here envisaged is through a detailed study of the structure of the
class of manifolds $\mathfrak{M}_f$ where $f$ ranges over $\mathfrak{B}$. This has not yet proved
successful.


BARNARD COLLEGE, COLUMBIA UNIVERSITY, 
NEw York, N. Y. 



ON THE OSCILLATION OF THE DERIVATIVES OF 
A PERIODIC FUNCTION 


BY 
GEORGE POLYA AND NORBERT WIENER 


1. Let $f(x)$ be a real valued periodic function of period $2\pi$ defined for all real
values of $x$ and possessing derivatives of all orders. Let $N_k$ denote the number of
changes of sign of $f^{(k)}(x)$ in a period. We consider the order of magnitude of $N_k$
as $k \to \infty$.

(I) If $N_k = O(1)$, $f(x)$ is a trigonometric polynomial.

(II) If $N_k = O(k^{\delta})$ where $\delta$ is fixed, $0 < \delta < 1/2$, $f(x)$ is an entire function of
finite order not exceeding $(1 - \delta)/(1 - 2\delta)$.

(III) If $N_k = o(k^{1/2})$, $f(x)$ is an entire function.


We prove this theorem by consideration of the Fourier series of $f(x)$


(1) $f(x) = \sum c_n e^{inx},$


$c_{-n} = \bar c_n$ $(n = 0, 1, 2, \cdots)$. Here, as in what follows, the sign $\sum$ without
explicitly stated limits means a summation from $-\infty$ to $\infty$. Under the present
conditions, the series (1) is absolutely and uniformly convergent for real $x$,
and so are the Fourier series of $f'(x), f''(x), \cdots$, obtained from (1) by term by
term differentiation. If we focus our attention on the Fourier series, we may
express the general trend of our theorem by saying that a small amount of 
oscillation in the higher derivatives implies a rapid decrease in the coefficients, 
this decrease being so extreme in case (I) that all coefficients from a certain 
point onward vanish. 

The theorem we have to prove and a few analogous facts(1) point towards
a general principle which cannot yet be stated in precise terms but which is 
not entirely unsuitably expressed by saying that a small amount of oscillation 
in the higher derivatives indicates a great amount of simplicity in the analytic 
nature of the function. 

An analogous theorem may be formulated for almost periodic functions. 
As in other theorems of this kind, the number of changes of sign in a period is 
replaced by the density of these changes over the infinite line and a trigo- 
nometric polynomial is replaced by an entire function of exponential type. 
The extension of case (I) of our theorem offers the least difficulty. 


Presented to the Society, May 2, 1941; received by the editors August 8, 1941. 

(1) See S. Bernstein, Leçons sur les Propriétés Extrémales, Paris, 1926, pp. 190-197 and
Communications de la Société Mathématique de Kharkow, (4), vol. 2 (1928), pp. 1-11; R. P.
Boas and G. Pólya, Proceedings of the National Academy of Sciences, vol. 27 (1941), pp. 323-325.




2. We start with a few preliminary remarks on changes of sign. We consider
first a real-valued function $f(x)$ which is defined in an interval $a \le x \le b$.
We say that this function has $N$ changes of sign in this interval if it is possible
to find $N + 1$, and no more, abscissae $x_0, x_1, \cdots, x_N$ in the interval such that


(2) $x_0 < x_1 < x_2 < \cdots < x_N,$
(3) $f(x_{\nu-1}) f(x_{\nu}) < 0, \quad \nu = 1, 2, 3, \cdots, N.$


If a function has $N$ changes of sign in an interval, its derivative has there at
least $N - 1$ changes of sign. This variant of Rolle's theorem is easily proved by
considering $\xi_{\nu}$ such that


$x_{\nu-1} < \xi_{\nu} < x_{\nu}, \quad f(x_{\nu}) - f(x_{\nu-1}) = (x_{\nu} - x_{\nu-1}) f'(\xi_{\nu}),$


and observing that $f(x_{\nu}) f'(\xi_{\nu}) > 0$ and that therefore


$f'(\xi_{\nu-1}) f'(\xi_{\nu}) < 0, \quad \nu = 2, 3, \cdots, N.$


Applying this to the function $e^{ax} f(x)$ (where $a$ is a real constant) and its
derivative $[e^{ax} f(x)]' = e^{ax}[a f(x) + f'(x)]$, we see that the number of changes of sign
of

$(a + D) f(x)$


(where $D$ is the symbol of differentiation) is not inferior to $N - 1$, $N$ being the
number of changes of sign of $f(x)$.

Now let $f(x)$ be periodic with the period $2\pi$. We say that the number of
changes of sign of $f(x)$ in a period is $N$, if it is possible to find just $N + 1$, and
no more, abscissae $x_0, x_1, \cdots, x_N$ such that


$x_N = x_0 + 2\pi$


and (2), (3) hold. Observe that $f(x_N) = f(x_0)$ and that, therefore, $N$ is necessarily
even. Hence it follows that the number of changes of sign of $(a + D) f(x)$
in a period is not inferior to that of $f(x)$. We defined $N_k$ in our initial statement;
now we see that


(4) $N_0 \le N_1 \le N_2 \le \cdots.$

Observing that


$(a + D)(a - D) \sum \frac{c_n e^{inx}}{a^2 + n^2} = \sum c_n e^{inx},$


we obtain:


LEMMA I. The number of changes of sign of the function (1) in a period is
not inferior to that of

$\sum \frac{c_n e^{inx}}{a^2 + n^2}.$



3. The series (1) represents a trigonometric polynomial of order $m$ if $c_n = 0$
for $n = m + 1, m + 2, \cdots$. If $f(x)$ is a trigonometric polynomial of order $m$,
it cannot have more than $2m$ roots in a period; this is well known. Observe
that for large $k$, $f^{(k)}(x)$ has actually $2m$ changes of sign, because as $k \to \infty$,
$(im)^{-k} f^{(k)}(x)$ approaches the first or the second of the two expressions


(5) $-c_{-m} e^{-imx} + c_m e^{imx}, \qquad c_{-m} e^{-imx} + c_m e^{imx},$


according as $k$ is odd or even. The second of these expressions is of the form
$2|c_m| \cos(mx - \gamma)$, with a certain real $\gamma$, and the first is of the same form
except for a factor $i$.

The case (I) of our theorem characterizes the trigonometric polynomials
and can be stated as follows: A real-valued periodic function $f(x)$ possessing
derivatives of all orders is a trigonometric polynomial if and only if the number
of changes of sign of $f^{(k)}(x)$ remains bounded for $k \to \infty$.

In order to prove this we consider (1). We have to show that some $f^{(k)}(x)$
have an arbitrarily great number of changes of sign if there are $c_n \ne 0$ with
arbitrarily large subscripts $n$. More precisely we shall show this:

If $m > 0$ and $c_m \ne 0$, then all derivatives of (1), from a certain stage onward,
have not less than $2m$ changes of sign.

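In fact, by repeated application of Lemma I, we ascertain that


(6) $f^{(k)}(x) = \sum (in)^k c_n e^{inx}$


does not have fewer changes of sign than


(7) $\sum \left( \frac{2mn}{m^2 + n^2} \right)^k c_n e^{inx}.$


But since it is given that $c_m \ne 0$ and that $\sum c_n$ is absolutely convergent, we
have from a certain $k$ onward


(8) $|c_m| > \sum_{n \ge 1,\ n \ne m} \left( \frac{2mn}{m^2 + n^2} \right)^k |c_n|.$


Indeed we have for $n > 0$, $n \ne m$, $0 < 2mn < m^2 + n^2$, and therefore, each term
tends to 0 on the right-hand side of (8) for $k \to \infty$.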

But if (8) holds for a certain even $k$, the sum in (7) has the same sign as
the second expression (5) in all those real points $x$ in which this latter reaches
$2|c_m|$, the maximum of its absolute value. This maximum is reached with
alternating signs, in equidistant points, the distance of two consecutive points
being $\pi/m$. Therefore (7) has not less than $2m$ changes of sign. We have
proved this for even $k$ but the same is true and the proof is nearly the same
for odd $k$. Then, by Lemma I, (6) has not less than $2m$ changes of sign, and
case (I) of our theorem is proved.
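The behavior described in case (I) can be observed numerically for a concrete trigonometric polynomial. The sketch below is an illustration only: the sample function $\cos x + 0.3 \sin 3x$ (order $m = 3$), the grid-based counting of sign changes, and all function names are assumptions made for the example, and a grid count is a heuristic rather than a proof.

```python
import math

def differentiate(coeffs, k):
    """coeffs maps n -> (a_n, b_n) for the term a_n cos(nx) + b_n sin(nx).
    Each differentiation sends (a_n, b_n) to (n * b_n, -n * a_n)."""
    out = dict(coeffs)
    for _ in range(k):
        out = {n: (n * b, -n * a) for n, (a, b) in out.items()}
    return out

def evaluate(coeffs, x):
    return sum(a * math.cos(n * x) + b * math.sin(n * x)
               for n, (a, b) in coeffs.items())

def sign_changes_in_period(coeffs, samples=4000):
    """Count sign changes over one period on a uniform grid, cyclically,
    ignoring grid points where the value is exactly zero."""
    signs = []
    for j in range(samples):
        v = evaluate(coeffs, 2 * math.pi * j / samples)
        if v != 0.0:
            signs.append(v > 0)
    # signs[j - 1] with j = 0 wraps around, closing the period.
    return sum(1 for j in range(len(signs)) if signs[j] != signs[j - 1])

f = {1: (1.0, 0.0), 3: (0.0, 0.3)}       # cos x + 0.3 sin 3x, so m = 3
counts = [sign_changes_in_period(differentiate(f, k)) for k in range(6)]
print(counts)
```

The counts are even, do not decrease with $k$, and settle at $2m = 6$, in agreement with (4) and with the statement proved above.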

4. We consider the case (III) of our theorem before case (II). 




If the periodic function $f(x)$ is analytic along the whole real axis, it is analytic
in a certain horizontal strip bisected by the real axis and the Fourier
series (1), which is a Laurent series in


$z = e^{ix},$

converges in the interior of the strip. Hence, by examining

(9) $\limsup_{n \to \infty} |c_n|^{1/n}$


we can distinguish the following three cases:

If $f(x)$ is not analytic along the whole real axis, (9) has the value 1.

If $f(x)$ is analytic in a certain horizontal strip of width $2h$ bisected by the
real axis, but in no wider horizontal strip, (9) has the value $e^{-h}$.

If $f(x)$ is an entire function, (9) has the value 0.

In order to prove case (III) of our theorem, we have to show that in the
first two cases $N_k = o(k^{1/2})$ is excluded. We prove the following statement.

If there exists a positive number $\gamma$ such that


(10) $\limsup_{n \to \infty} |c_n| e^{\gamma n} = \infty,$


then there exists a positive number $g$ such that $f^{(k)}(x)$ has, for an infinity of values
of $k$, not less than $(k/g)^{1/2}$ changes of sign.

By the considerations of the foregoing section, $f^{(k)}(x)$ has certainly not
less than $2m$ changes of sign if (8) holds. Using (10), we have to find an
arbitrarily large $m$ and a corresponding $k$ such that (8) holds. We shall succeed
in finding such an $m$ by applying the following known lemma(2).


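LEMMA II. We consider two infinite sequences $l_1, l_2, \cdots, l_n, \cdots$ and
$s_1, s_2, \cdots, s_n, \cdots$, and suppose that


(11) $l_n \ge 0, \quad n = 1, 2, 3, \cdots,$
(12)
(13) $\lim_{n \to \infty} l_n = 0,$


(14) $\limsup_{n \to \infty} l_n s_n = \infty.$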


Then there exists an infinity of integers m such that 
ben w=1,2,3,---, 


We put 


(2) See G. Pólya and G. Szegő, Aufgaben und Lehrsätze aus der Analysis, Berlin, 1925, vol. 1,
pp. 18 and 173, Problem 109.




$|c_n| = l_n, \quad e^{\gamma n} = s_n.$


This choice satisfies (11), (12), (13), (14); in fact, (13) is satisfied because (1)
is convergent, and (14) is satisfied because we have supposed (10). Thus we
obtain an infinity of $m$ such that


w=1,2,3,---, 
| cmp | w=1,2,---, m—1. 


This we use to estimate the following sum. (Our ultimate aim is to prove (8).) 


2m(m— \*| 2 Im(m+u) \* | Cmte 
(15) 2m(m — p) 2m(m+u) \* 
<2 2+ (m — yu)? 
= Se. 


We introduced the abbreviations 


m—1 


2(1 + u/m) 
an +(1+p/m)) 


and we shall consider $S_1$ and $S_2$ in turn.
(1) Split the sum $S_1$ in two parts, $\mu$ being less than or equal to $m/2$ in
the first part and greater than $m/2$ in the second. Using the fact that


(1 + < 0<x<1, 


we obtain 
m/2 


m/2 
m/2 


1 
We put


(18) $k = 4gm^2$


where $g$ is a positive integer, $g > \gamma$. We choose a fixed $g$ such that for
sufficiently great $m$


(19) $S_1 < 1/2.$


(2) The function $2x(1 + x^2)^{-1}$ decreases for $x > 1$. Therefore by (17)


\1+ 2? 2 \2¢ 


We used a well known asymptotic evaluation of definite integrals(3) and (18).
If $g \ge 2$, which we assume, we obtain for sufficiently great $m$


(20) $S_2 < 1/2.$


But (15), (19), (20) show that (8) is true so that $f^{(k)}(x)$ has not fewer
changes of sign than


$2m = (k/g)^{1/2}.$
5. We now proceed to the proof of case (II). 


LEMMA III. The Fourier series (1) represents an entire function of the finite
order $\lambda$, $\lambda > 1$, if and only if


(21) $\liminf_{n \to \infty} \frac{\log \log (1/|c_n|)}{\log n} = \frac{\lambda}{\lambda - 1}.$


The proof consists of two parts. Both parts follow familiar lines; so we 
do not give all the details. 

(1) Assume that $f(x)$ is entire and of order $\lambda$. Then for a fixed positive $\epsilon$
and for sufficiently great $|x|$,


(22) $|f(x)| < e^{|x|^{\lambda + \epsilon}}.$


If we evaluate c, and shift the line of integration (periodicity and Cauchy’s 
formula), we obtain as a result that if r is any positive number, 


— 


1 
(23) =— f ir + 
2r 


| Cn | s t—nr 


Here we use (22). We choose $r$, for given $n$, so that this right-hand side of (23)
shall be a minimum. It follows by straightforward calculation that (21) holds
with "$\ge$" instead of "=."

(2) Assume that 


, log log (1/| Ca | ) 
lim inf = 
log 


k>0. 


(3) See, for example, G. Pólya and G. Szegő, loc. cit., vol. 1, pp. 78 and 244, Problem 201.




Therefore we have, for a given positive ¢ and all sufficiently great n 
| | < t+ 


We choose n, for a given x, so that the right-hand side is a maximum. This 
maximum gives the right order of magnitude because the terms of (1) whose 
index surpasses a certain multiple of the index of the maximum term, yield 
a negligible contribution. We find that the order \ of f(x) satisfies the in- 
equality 

K 


This gives (21) with "$\le$" instead of "=."

6. Now we are prepared to prove case (II) of our theorem. We have to
show that if the entire function $f(x)$ is of order $\lambda$, and $\epsilon > 0$ then, for an infinity
of $k$,

$N_k > k^{(\lambda - 1)/(2\lambda - 1) - \epsilon}.$


Put $\lambda/(\lambda - 1) + \eta = \gamma$, $\eta$ being positive and small. By Lemma III, the fact
we have to show can be stated as follows:
If there exists a positive number $\gamma$, $\gamma > 1$, such that


$\limsup_{n \to \infty} |c_n| e^{n^{\gamma}} = \infty,$
then there exists a positive $g$ such that $f^{(k)}(x)$ has, for an infinity of $k$, more than


$(k/g)^{1/(\gamma + 1)}$ changes of sign.
We apply Lemma II, whose conditions are satisfied by


$l_n = |c_n|, \quad s_n = e^{n^{\gamma}}.$
We obtain the result that for an infinity of m, 
m=12,---, 
| cm cm—p| w=1,2,---,m—1, 


Hence, using the fact that y >1, we obtain 


2m(m — \*| Cm—p = 2m(m + \*| 
= Si + Se. 


$S_2$ has the same meaning as before (see (17)), and




(25) 


We put 




m—1 —k 


p= 2m(m — yu) 
m/2 
< > kyut/4m2 + me™™(4/5)* 
m/2 
pel 


$k = 4g m^{\gamma + 1},$


and we choose g so that 


and so that & is an integer. This choice assures that 


Si—0, 


for $m \to \infty$ (see (25) and the considerations preceding (20)). Therefore, by (24),
(8) is true and $f^{(k)}(x)$ has not fewer changes of sign than


$2m > 2[k/4(\gamma + 2)]^{1/(\gamma + 1)}.$


BROWN UNIVERSITY,
PROVIDENCE, R. I. 

MASSACHUSETTS INSTITUTE OF TECHNOLOGY, 
CAMBRIDGE, Mass. 




STRUCTURE OF LINEAR SETS 


BY 
ERNST SNAPPER 


INTRODUCTION 


In general only vector spaces whose scalar domains are fields have been 
investigated. In this paper we study vector spaces whose scalar domains are 
integral domains in which every ideal has a finite basis. It is shown that the 
theory of the linear subsets of such a vector space can be developed by using 
exactly the same technique which has always been applied to the ideals of 
an integral domain. This technique is established in the first two sections and 
is then used to develop the Noether decomposition theory of linear sets into 
“primary” linear sets. 


I. BASIS THEOREM AND ASCENDING CHAIN CONDITION 
We first define the notion of vector space: 


DEFINITION 1.1. An n-dimensional vector space consists of all the vectors 
with n components where the components form an integral domain in which every 
ideal has a finite basis. 


This integral domain is called the scalar domain of the vector space. 


DEFINITION 1.2. A linear set is a subset of an n-dimensional vector space 
closed under vector subtraction and multiplication by arbitrary scalars. 


An integral domain in which every ideal has a finite basis is clearly a one- 
dimensional vector space and its ideals are the linear sets. The ordinary theory 
of ideals will therefore become a part of this theory of linear sets. 

Notation. The fixed n-dimensional vector space will be indicated by $V_n$
and its scalar domain by $R$. Otherwise the capital Latin letters will be used
exclusively to indicate linear sets and the small Latin letters to indicate
vectors. The ideals of $R$ will be indicated by German letters and the scalars
of $R$ by Greek letters. The radical of an ideal $\mathfrak{a}$ will be indicated by
$\mathfrak{a}'$. As is customary, $\mathfrak{a}'$ consists of all the scalars of which a power
lies in $\mathfrak{a}$.

The following theorems, whose proofs are apparent from van der 
Waerden’s Moderne Algebra, vol. 2 (we refer to this book as [1]), will be used 
frequently: 


THEOREM 1.1 (The basis theorem). Every linear set has a finite basis. 


Presented to the Society, February 22, 1941; received by the editors December 3, 1940, 
and, in revised form August 9, 1941. 




THEOREM 1.2 (The ascending chain condition). A chain of linear sets
$L_1\subset L_2\subset\cdots$ in which $L_i$ is properly contained in $L_{i+1}$ is finite.


THEOREM 1.3 (Maximum condition). Every non-empty set of linear sets
possesses a maximal set (that is, a linear set which is not contained in any
other linear set of that set).


THEOREM 1.4 (Induction by division). If a property can be proved for every
linear set $L$, $V_n$ included, under the assumption that the property can be proved
for all the linear sets which properly contain $L$, the property holds for every
linear set.

II. PRODUCTS AND QUOTIENTS 


The technique mentioned above for treating the linear sets is based upon 
the following definitions: 


DEFINITION 2.1. The greatest common divisor or sum $L=(L_1+L_2)$ of two
linear sets $L_1$ and $L_2$ is the linear set $L$, generated by their logical sum.


DEFINITION 2.2. The least common multiple $L=[L_1\cap L_2]$ of two linear sets
$L_1$ and $L_2$ is their intersection.


DEFINITION 2.3. The product $L=\mathfrak{a}L_1$ of an ideal $\mathfrak{a}$ and a linear set $L_1$ is
the linear set $L$, generated by the scalar products of scalars of $\mathfrak{a}$ and vectors
of $L_1$.


DEFINITION 2.4. The quotient $L=L_1/\mathfrak{a}$ of a linear set $L_1$ and an ideal $\mathfrak{a}$ is
the linear set $L$, consisting of the vectors whose scalar products with all the
scalars of $\mathfrak{a}$ lie in $L_1$.


DEFINITION 2.5. The quotient $\mathfrak{a}=L_1/L_2$ of a linear set $L_1$ and a linear set $L_2$
is the ideal $\mathfrak{a}$, consisting of the scalars whose scalar products with all the
vectors of $L_2$ lie in $L_1$.


The theorems concerning these notions are the same as in the case of ideals 
and are proved in the same way. The most important ones are listed below 
(for notation see Part I): 


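THEOREM 2.1. Multiplication is associative and is distributive with respect
to addition:
\[ \mathfrak{a}(L_1+L_2) = (\mathfrak{a}L_1+\mathfrak{a}L_2). \]

THEOREM 2.2. Division is distributive with respect to intersection:
\[ [L_1\cap L_2]/\mathfrak{a} = [L_1/\mathfrak{a}\cap L_2/\mathfrak{a}]. \]

THEOREM 2.3. The quotient of the sum is the intersection of the quotients:
\[ L/(\mathfrak{a}_1+\mathfrak{a}_2) = [L/\mathfrak{a}_1\cap L/\mathfrak{a}_2], \qquad L/(M_1+M_2) = [L/M_1\cap L/M_2]. \]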
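
As a hedged illustration of Theorem 2.3 in the one-dimensional case, one may take the scalar domain to be the rational integers, so that every linear set is an ideal (l). The sketch below assumes the standard facts that (a_1)+(a_2) is generated by the greatest common divisor, that an intersection of ideals is generated by the least common multiple, and that (l)/(a) is generated by l/gcd(l, a). The function names are illustrative and do not come from the paper.

```python
from math import gcd


def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a // gcd(a, b) * b


def quotient(l, a):
    """Positive generator of the quotient ideal (l)/(a) in the integers."""
    return l // gcd(l, a)


def theorem_2_3_holds(l, a1, a2):
    """Check (l)/((a1)+(a2)) == [(l)/(a1) ∩ (l)/(a2)] on generators."""
    left = quotient(l, gcd(a1, a2))                   # quotient by the sum
    right = lcm(quotient(l, a1), quotient(l, a2))     # intersection of quotients
    return left == right
```

For instance, with l = 12, a1 = 4 and a2 = 6 both sides are generated by 6.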




III. CLOSURE, ESSENTIAL IDEAL AND RADICAL


The notions discussed in this section play an important role in the struc- 
ture theory of linear sets. 


DEFINITION 3.1. The closure $\bar{L}$ of a linear set $L$ is the linear set consisting
of all the vectors of which some scalar multiple, different from zero, lies in $L$.


This definition is clearly consistent and we always have

\[ L\subseteq\bar{L}, \qquad \overline{[L_1\cap L_2]} = [\bar{L}_1\cap\bar{L}_2]. \]


DEFINITION 3.2. The linear set $L$ will be called closed if $\bar{L}=L$ and dense if
$\bar{L}=V_n$.


EXAMPLE 3.1. If $R$ consists of all the rational integers, then in $V_2$ over $R$
the set $L_1$ generated by (2, 2) is neither closed nor dense; $L_2=(1, 1)$ is closed
but not dense; $L_3$, the set generated by (1, 0) and (0, 2), is not closed but is
dense.
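
The classification in Example 3.1 can be checked mechanically. The following is a minimal sketch, assuming the scalar domain is the rational integers and the linear set is given by at most two generators in $V_2$; the helper name `classify` is hypothetical. It uses two elementary facts: two generators with nonzero determinant span a dense set, which is closed only when the determinant is ±1, and parallel generators span a set that is closed exactly when the greatest common divisor of all components is 0 or 1.

```python
from math import gcd


def classify(generators):
    """Return (closed, dense) for the set generated in Z^2 by one or two vectors."""
    vectors = list(generators) + [(0, 0)] * (2 - len(generators))
    (a1, a2), (b1, b2) = vectors
    det = a1 * b2 - a2 * b1
    if det != 0:
        # Rank 2: some nonzero multiple of every vector lies in the set.
        return (abs(det) == 1, True)
    # Rank at most 1: the set is generated by g times a primitive vector.
    g = gcd(gcd(a1, a2), gcd(b1, b2))
    return (g in (0, 1), False)
```

Under these assumptions `classify([(2, 2)])` reports neither closed nor dense, `classify([(1, 1)])` reports closed but not dense, and `classify([(1, 0), (0, 2)])` reports dense but not closed, in agreement with the example.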

Every closure of a linear set is a closed set and the nonzero ideals, consid- 
ered as the linear subsets of a one-dimensional vector space, are dense sets. 
The only linear set which is both closed and dense is $V_n$.

The following theorem will be used later: 


THEOREM 3.1. The intersection of two closed sets is closed. The intersection 
of two dense sets is dense. The irredundant intersection of a closed set and a dense 
set is neither closed nor dense. (An intersection is called irredundant if none of 
the intersection components is superfluous.) 


This theorem follows immediately from the fact that


\[ \overline{[L_1\cap L_2]} = [\bar{L}_1\cap\bar{L}_2]. \]


DEFINITION 3.3. The essential ideal $\mathfrak{E}=L/V_n$ of a linear set $L$ is the quotient
ideal of the linear set and the whole space $V_n$. The radical $\mathfrak{E}'$ of $L$ is the
radical of $\mathfrak{E}$.


The essential ideal $\mathfrak{E}$ clearly consists of all the scalars which transform the
whole space by means of scalar multiplication into the linear set. Therefore
we have $L/\mathfrak{E}=V_n$. The essential ideal of an ideal is the ideal itself and the
radical of an ideal is just its ordinary radical.
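
For a concrete feel of the essential ideal, the sketch below computes its non-negative generator for a linear set in $V_2$ over the rational integers given by two generators. It assumes the Smith normal form fact that the quotient of $V_2$ by a rank-two set is a product of two cyclic groups of orders $d_1$ and $d_2=|\det|/d_1$, where $d_1$ is the greatest common divisor of the four components, so that the essential ideal is $(d_2)$; for rank below two the essential ideal is zero. The function name is hypothetical.

```python
from math import gcd


def essential_generator(g1, g2):
    """Non-negative generator of L/V_2 for L generated by g1, g2 in Z^2."""
    det = abs(g1[0] * g2[1] - g1[1] * g2[0])
    if det == 0:
        return 0                      # L is not dense, so only 0 maps V_2 into L
    d1 = gcd(gcd(g1[0], g1[1]), gcd(g2[0], g2[1]))
    return det // d1                  # largest invariant factor of the generator matrix
```

For the set generated by (1, 0) and (0, 2) this gives 2, and the value is nonzero exactly when the set is dense.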


THEOREM 3.2. The following three statements are equivalent: $L$ is dense;
$\mathfrak{E}\neq 0$; $L$ has maximal dimension $n$.


Proof. If $L$ is dense, the ideals $L/(1, 0, \cdots, 0)$, $L/(0, 1, 0, \cdots, 0)$, $\cdots$,
$L/(0, \cdots, 1)$ are nonzero ideals and therefore
$\mathfrak{E}=L/V_n=[L/(1, 0, \cdots, 0)\cap\cdots\cap L/(0, \cdots, 1)]$ is a nonzero ideal.
If $\mathfrak{E}\neq 0$, there exists a scalar $\lambda\neq 0$ such that the $n$ vectors
$(\lambda, 0, \cdots, 0), \cdots, (0, \cdots, \lambda)$ lie in $L$ and therefore $L$ has maximal
dimension $n$. If $L$ has maximal dimension $n$, there are $n$ independent vectors
$f_1, \cdots, f_n$ in $L$. An arbitrary vector $v$ may then be written as
$v=(\alpha_1/\beta_1)f_1+\cdots+(\alpha_n/\beta_n)f_n$ and therefore
$\lambda v=\mu_1f_1+\cdots+\mu_nf_n$ lies in $L$ for a suitable nonzero scalar $\lambda$
(for instance $\lambda=\beta_1\cdots\beta_n$). Consequently, $L$ is dense. This proves the
theorem.

We conclude this section by remarking that $L\subseteq L/\mathfrak{a}\subseteq\bar{L}$ and
$\mathfrak{E}_L\subseteq L/M\subseteq R$, where $\mathfrak{a}$ has to be a nonzero ideal.


IV. PRIMARY LINEAR SETS AND PRIME LINEAR SETS 


DEFINITION 4.1. A linear set $L$ will be called primary if $\lambda v\equiv 0(L)$ implies
either $v\equiv 0(L)$ or $\lambda\equiv 0(\mathfrak{E}')$.


DEFINITION 4.2. A linear set will be called prime if the linear set is primary
and its $\mathfrak{E}$ is a prime ideal.


It is clear that the ideals which are prime or primary according to these 
definitions are the ordinary prime and primary ideals. 


THEOREM 4.1. If $L$ is primary, $\mathfrak{E}$ is primary and therefore $\mathfrak{E}'$ is prime.


Proof. $\alpha\beta\equiv 0(\mathfrak{E})$ and $\beta\not\equiv 0(\mathfrak{E})$ imply $L/\alpha\beta=V_n$ and
$L/\beta\neq V_n$. We can find therefore a vector $v$ such that $\alpha\beta v\equiv 0(L)$ and
$\beta v\not\equiv 0(L)$ from which it follows that $\alpha\equiv 0(\mathfrak{E}')$, q.e.d.

The converse does not hold, since the set generated by the vector (2, 4) in
the vector space whose scalar domain consists of the rational integers is not
primary while its essential ideal is the zero ideal and therefore prime.
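
The counterexample just given can be made explicit with a witness. The sketch below assumes the rational integers as scalar domain and the single generator (2, 4); since the essential ideal is zero, its radical is zero as well, so a violation of Definition 4.1 is any nonzero scalar that carries a vector outside the set into the set. The helper names are illustrative.

```python
def in_L(v, g=(2, 4)):
    """True if v is an integer multiple of the nonzero generator g."""
    (x, y), (a, b) = v, g
    if x * b != y * a:            # v is not parallel to g
        return False
    if a != 0:
        return x % a == 0
    return y % b == 0


def violates_primary(lam, v):
    """lam*v lies in L, v does not, and lam is outside the zero radical."""
    scaled = (lam * v[0], lam * v[1])
    return in_L(scaled) and not in_L(v) and lam != 0
```

With `lam = 2` and `v = (1, 2)` the check succeeds, which is one way to see that the set is not primary.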


THEOREM 4.2. A primary linear set is either closed or dense. 


Proof. If $\mathfrak{E}\neq 0$, $L$ is dense. If $\mathfrak{E}=0$, $\mathfrak{E}'=0$, from which we conclude that
if $\lambda v\equiv 0(L)$ and $v\not\equiv 0(L)$, $\lambda$ has to be zero. This means that $L$ is closed.

The converse does not hold since an ideal is a dense set but may not be
primary. However, every closed set is a prime linear set since every closed
linear set other than $V_n$ has a zero essential ideal.

The following theorems will demonstrate further that a primary linear set 
is the exact analogue of a primary ideal. These theorems are proved in exactly 
the same way as in the case of ideals and are used constantly in the structural 
theory of linear sets. 


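THEOREM 4.3. Let $L$ be an arbitrary linear set, $\mathfrak{E}$ its essential ideal and $\mathfrak{a}$ an
arbitrary ideal. Then $L$ is primary and $\mathfrak{a}$ is equal to the radical $\mathfrak{E}'$ of the linear
set if and only if the following three conditions hold: (1) $\lambda v\equiv 0(L)$ implies either
$v\equiv 0(L)$ or $\lambda\equiv 0(\mathfrak{a})$; (2) $\mathfrak{E}\equiv 0(\mathfrak{a})$; (3) $\lambda\equiv 0(\mathfrak{a})$ implies
$\lambda^k\equiv 0(\mathfrak{E})$ for some $k$.


THEOREM 4.4. Let $L$ be a primary linear set, $\mathfrak{a}$ an ideal and $M$ a linear set.
Then $\mathfrak{a}M\equiv 0(L)$ implies either $M\equiv 0(L)$ or $\mathfrak{a}\equiv 0(\mathfrak{E}')$.


COROLLARY 4.1. If $L$ is primary and $\mathfrak{a}\not\equiv 0(\mathfrak{E}')$, $L/\mathfrak{a}=L$.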




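THEOREM 4.5. If $L=[L_1\cap\cdots\cap L_k]$, we have $\mathfrak{E}=[\mathfrak{E}_1\cap\cdots\cap\mathfrak{E}_k]$
and $\mathfrak{E}'=[\mathfrak{E}_1'\cap\cdots\cap\mathfrak{E}_k']$, where $\mathfrak{E}_i$ and $\mathfrak{E}_i'$ are the essential ideal and
radical of $L_i$.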


V. THE NOETHER DECOMPOSITION THEORY 


A linear set will of course be called reducible if it is the irredundant inter- 
section of two other linear sets and irreducible if it is not reducible. 
From Theorem 1.2, the following theorem is derived: 


THEOREM 5.1. Every linear set is the irredundant intersection of a finite num- 
ber of irreducible linear sets. 


After each of the following theorems we have indicated how to derive the 
proofs of these theorems from the very similar proofs of the corresponding 
theorems on ideals, given in [1]. 


THEOREM 5.2. Every irreducible linear set is primary. 


Proof. (See [1, pp. 31 and 32].) Replace the radical of the ideal by the 
radical of the linear set and the integral domain by the vector space. 
An immediate corollary of Theorems 5.1 and 5.2 is: 


THEOREM 5.3. Every linear set is the irredundant intersection of a finite num- 
ber of primary linear sets. 


THEOREM 5.4. The intersection of a finite number of primary linear sets 
which all have the same radical is a primary linear set with that same radical. 


Proof. (See [1, pp. 32 and 33].) Use Theorem 4.3. 


By replacing the intersection components which have the same radical by 
their intersection we prove: 


THEOREM 5.5. Every linear set is the irredundant intersection of a finite num- 
ber of primary linear sets whose radicals are all different. 


A representation of a linear set as described in Theorem 5.5 is called “a 
representation by maximal primary components.” 

As a preparation for the first uniqueness theorem we need the following 
lemma: 


LEMMA 5.1. In two representations of a linear set by maximal primary
components, each radical which is maximal among all the radicals of both the
representations is present in both the representations.


Proof. (See [1, pp. 35, 36].) Divide the components of both the representa- 
tions by the essential ideal of a primary component whose radical is maximal 
among all the radicals of both the representations. Use Corollary 4.1. 

From this lemma the following theorem is easily derived:


THEOREM 5.6. An irredundant intersection of a finite number of primary 
linear sets, whose radicals are not all the same, is not primary. 


This theorem shows that in a representation of a linear set by maximal 
primary components no group of components has a primary intersection. 


THEOREM 5.7 (The first uniqueness theorem). In two representations of a 
linear set by maximal primary components, the number of components is the same 
and so are the radicals of the components of the two representations. 


Proof. (See [1, p. 36].) Single out, from the two representations, two linear
sets $L_1$ and $L_2$ whose radicals $\mathfrak{E}_1'$ and $\mathfrak{E}_2'$ are the same and are maximal among
all the radicals of both the representations. Then divide both the
representations by the intersection of the two essential ideals of $L_1$ and $L_2$.

The uniquely determined radicals which occur in a representation of a 
linear set by maximal primary components are prime ideals (Theorem 4.1). 
They will be called the “associated prime ideals” of the linear set. 

The following theorems are essential for the proof of the second unique- 
ness theorem: 


THEOREM 5.8. We have $M/\mathfrak{E}_L=M$ if and only if the essential ideal $\mathfrak{E}_L$ of the
linear set $L$ is not contained in any of the associated prime ideals of the linear
set $M$.


Proof. (See [1, p. 37].) Divide by $\mathfrak{E}_L$ and use the fact that $\mathfrak{E}_1N_2\subseteq[N_1\cap N_2]$,
where $N_1$ and $N_2$ are linear sets and $\mathfrak{E}_1$ is the essential ideal of $N_1$.
Another way of stating this theorem is:


THEOREM 5.9. We have $M/\mathfrak{E}_L=M$, where $\mathfrak{E}_L$ is the essential ideal of $L$, if
and only if no associated prime ideal of $L$ is contained in an associated prime
ideal of $M$.


This theorem leads to:


DEFINITION 5.1. The linear set $L$ is relatively prime with respect to the linear
set $M$ if and only if $M/\mathfrak{E}_L=M$.


The following definitions lead to the second uniqueness theorem: 


DEFINITION 5.2. An associated prime ideal of a linear set is said to be im- 
bedded if it contains another associated prime ideal of that linear set. 


DEFINITION 5.3. An intersection of maximal primary components, all belong- 
ing to the same representation of a linear set L by maximal primary components, 
is called a component linear set of L. 


DEFINITION 5.4. Two component linear sets, which are taken from the same 
representation of a linear set L by maximal primary components and which have 
intersection L, are called conjugate component linear sets of L. 




If we call the conjugate of $L$, considered as a component linear set of
itself, $V_n$, we have: Every component linear set of a linear set has a conjugate,
which is not necessarily unique.


DEFINITION 5.5. A component linear set of a linear set is said to be isolated
if none of its associated prime ideals contains an associated prime ideal of its
conjugate.


THEOREM 5.10 (The second uniqueness theorem). An isolated component 
linear set of a linear set is uniquely determined by its associated prime ideals. 


Proof. (See [1, p. 38].) Use Theorem 5.9. If in [1] the author divides by 
an ideal, here we have to divide by the essential ideal of the corresponding 
linear set. 


VI. TWO CONSEQUENCES OF THE STRUCTURE THEORY


It might be possible that the associated prime ideals of a linear set were 
always the same as the associated prime ideals of the essential ideal of that 
linear set. This would suggest that the representation of the linear set by 
maximal primary components were induced by the Noether decomposition 
of the essential ideal of the linear set. However, this conjecture will be dis- 
proved by the following counterexample: 

EXAMPLE 6.1. Let $V_2$ have the polynomials in two variables $x$ and $y$ with
integral coefficients as scalar domain. Let $L_1$ be the linear set consisting of all
the vectors whose two components have a difference congruent to zero modulo
$(x^2)$, and $L_2$ the linear set consisting of the vectors whose two components
have a sum congruent to zero modulo $(x^2, y)$. Consider finally the linear set
$L=[L_1\cap L_2]$. The following statements can easily be verified:

$L_1$ is a primary linear set. $\mathfrak{E}_1=(x^2)$ and $\mathfrak{E}_1'=(x)$.

$L_2$ is a primary linear set. $\mathfrak{E}_2=(x^2, y)$ and $\mathfrak{E}_2'=(x, y)$.

$L=[L_1\cap L_2]$ is a representation of $L$ by maximal primary components.
The associated prime ideals of $L$ are $(x)$ and $(x, y)$. However, the essential
ideal of $L$ is $(x^2)$ and its only associated prime ideal is $(x)$.
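
The essential ideals quoted in Example 6.1 rest on one short computation, sketched here for a general modulus. The symbol $\mathfrak{m}$ for the modulus ideal is introduced for illustration only and is not notation from the paper.

```latex
% Assumption: L = { (a, b) in V_2 : a - b \equiv 0 (\mathfrak{m}) } for an ideal \mathfrak{m} of R.
\begin{align*}
\lambda \in L/V_2
  &\iff \lambda(a, b) \in L \ \text{for all } (a, b) \in V_2 \\
  &\iff \lambda(a - b) \equiv 0\ (\mathfrak{m}) \ \text{for all } a, b \in R \\
  &\iff \lambda \equiv 0\ (\mathfrak{m}) \qquad (\text{take } a = 1,\ b = 0).
\end{align*}
% Hence L/V_2 = \mathfrak{m}. The same argument with a + b in place of a - b covers the sum case.
```

So the essential ideal of such a set is the modulus itself, and by Theorem 4.5 the essential ideal of an intersection is the intersection of the moduli.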

This example illustrates the general situation described in the following 
theorem: 


THEOREM 6.1. The associated prime ideals of the essential ideal of a linear 
set are among the associated prime ideals of that linear set. 


Proof. From Theorem 4.5 we see that the representation of a linear set by
maximal primary components induces a decomposition of the essential ideal
into primary ideals, whose radicals are all different and equal to the
associated primes of the linear set. However, this decomposition may not be
irredundant, as Example 6.1 shows, where this induced decomposition is
$(x^2)=[(x^2)\cap(x^2, y)]$. By deleting the unnecessary components in the
intersection, we get the Noether decomposition of the essential ideal and the
remaining radicals are the associated primes of the essential ideal.


THEOREM 6.2. Every linear set is the irredundant intersection of a closed set
and a dense set. The closed part of the intersection is unique and is the closure
of the set. The dense part of the intersection is not unique, even when the
intersections are restricted to representations by maximal primary components. The
dense part of the intersection is not present if and only if the linear set is closed,
and the closed part of the intersection is not present if and only if the set is dense.


Proof. Consider a representation of the linear set by maximal primary
components. From Theorems 3.1 and 3.2 it follows that the intersection of
the intersection components whose radicals differ from zero is a dense set and
that the intersection component whose radical is the zero ideal is a closed set.
This dense set and this closed set clearly satisfy the requirements of the
theorem. That the closed part of the intersection is the closure of the linear set
is proved as follows: Let $L=[C\cap D]$ where $D$ is dense and $C$ is closed. Then
$\bar{L}=[\bar{C}\cap\bar{D}]$ (Part III) and $\bar{C}=C$ and $\bar{D}=V_n$. Therefore $\bar{L}=C$. The last part
of the theorem follows immediately from Theorem 3.1. Finally, the following
example shows that the dense part of the intersection is not unique, even if
we restrict ourselves to representations by maximal primary components:

EXAMPLE 6.2. Let $V_2$ have the rational integers as scalar domain. Let $L$ be
the linear set generated by the vector (2, 2). The following statements can
easily be verified: The linear set generated by the vector (1, 1) is the closure $\bar{L}$
of $L$. $L$ is the intersection of $\bar{L}$ and the dense set generated by the vectors (2, 2)
and (1, 3). $L$ is also the intersection of $\bar{L}$ and the dense set generated by the
vectors (2, 2) and (2, 4). These are two representations of $L$ by maximal
primary components and the two dense parts of the two intersections are
different.
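
The two intersections in Example 6.2 can be verified on a finite window by direct computation. The sketch below assumes integer scalars and tests membership in a set with two independent generators by Cramer's rule; the function names and the window size are illustrative choices.

```python
def in_lattice(v, g1, g2):
    """True if v = a*g1 + b*g2 for integers a, b (g1, g2 independent)."""
    det = g1[0] * g2[1] - g1[1] * g2[0]
    if det == 0:
        raise ValueError("generators must be independent")
    a_num = v[0] * g2[1] - v[1] * g2[0]   # Cramer numerator for a
    b_num = g1[0] * v[1] - g1[1] * v[0]   # Cramer numerator for b
    return a_num % det == 0 and b_num % det == 0


def closure_cap_dense(g1, g2, bound=50):
    """Points (t, t) of the closure, |t| <= bound, lying in the dense set."""
    return [(t, t) for t in range(-bound, bound + 1) if in_lattice((t, t), g1, g2)]


# The set generated by (2, 2), restricted to the same window.
L_window = [(t, t) for t in range(-50, 51) if t % 2 == 0]
```

Both `closure_cap_dense((2, 2), (1, 3))` and `closure_cap_dense((2, 2), (2, 4))` reproduce `L_window`, while (1, 3) belongs to the first dense set and not to the second, so the two dense parts differ.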


REFERENCE 
1. B. L. van der Waerden, Moderne Algebra, vol. 2, Berlin, 1940. 


PRINCETON UNIVERSITY, 
PRINCETON, N. J.


ON A GENERALIZATION OF THE PROBLEM 
OF QUASI-ANALYTICITY 


BY 
S. MANDELBROJT AND F. E. ULRICH 


Introduction. The class of all functions analytic in the closed interval
$a\le x\le b$ may be characterized in the following manner: It is the class of all
functions $f(x)$ defined and infinitely differentiable (indicated hereafter as i.d.
functions) in the interval such that to each function there corresponds a
positive constant $k=k(f)$ with the property [5](¹) that


\[ |f^{(n)}(x)| < k^n n!, \qquad n\ge 1. \]


Gevray(²), in studying the heat equation, introduced functions $\Phi(x)$ such
that

\[ |\Phi^{(n)}(x)| < k^n(2n)!, \qquad n\ge 1, \]

or more generally

\[ |\Phi^{(n)}(x)| < k^n\Gamma(\alpha n), \qquad n\ge 1, \]

where $\alpha$ is a constant greater than unity. These functions are in general not
analytic. In fact, it is possible to construct a function [5] satisfying this last
inequality in a closed bounded interval, and such that the function and all
its derivatives are zero at one point of the interval, but which is not
identically zero in the interval.

In the work to follow we shall denote by $I$ either a closed bounded interval
$[a, b]$, or an infinite interval of one of the three forms: $(-\infty, b]$, $(-\infty, \infty)$,
$[a, \infty)$. If $\{M_n\}$ is a sequence of positive constants, we shall denote by $C_{\{M_n\}}$
the class of all functions $f(x)$ i.d. in an interval $I$ and such that to each
function there corresponds a positive constant $k$ with the property that in $I$


\[ |f^{(n)}(x)| < k^nM_n, \qquad n\ge 1. \]


The class of analytic functions and the two classes of Gevray mentioned
above, each defined in a closed bounded interval, are respectively classes
$C_{\{n!\}}$, $C_{\{(2n)!\}}$ and $C_{\{\Gamma(\alpha n)\}}$.

The class $C_{\{n!\}}$ has the property that each function belonging to it such
that the function and its derivatives of all orders are zero at one point of the
interval is identically zero. Or what is equivalent, two functions each belonging
to this class, and having the same value and the same derivatives of all
orders at one point of the interval are identically equal in the interval. A
class $C_{\{M_n\}}$ having this property is called quasi-analytic. Thus the classes of
Gevray are not quasi-analytic.


Presented to the Society, September 10, 1942; received by the editors August 15, 1941.
(¹) Numbers in brackets refer to the references listed in the bibliography given at the end
of the paper.
(²) See, for instance, S. Mandelbrojt, Rice Institute Pamphlet, vol. 29, 1942.

The physical character of Gevray’s work indicates the importance of
knowing necessary and sufficient conditions on the sequence $\{M_n\}$ in order
that the class $C_{\{M_n\}}$ will be quasi-analytic. This problem was proposed by
Hadamard(³) in 1912. Denjoy [3] gave a sufficient condition and Carleman
[2] generalized Denjoy’s condition and gave the complete answer to
Hadamard’s question. The following is Carleman’s theorem in a form given by
Ostrowski [8]:

Let $T(r)=\mathrm{l.u.b.}_{n\ge 1}(r^n/M_n)$. A necessary and sufficient condition that the
class $C_{\{M_n\}}$ will be quasi-analytic is that $\int_1^\infty\log T(r)\,dr/r^2=\infty$.
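
To get a numerical feel for the function $T(r)$ in Carleman's criterion, the following sketch evaluates $\log T(r)$ for the sequences $M_n=n!$ and $M_n=(2n)!$. It is only an illustration: the least upper bound is truncated at a finite index `n_max`, which is an assumption adequate for moderate $r$, and the helper names are hypothetical.

```python
from math import lgamma, log


def log_T(r, log_M, n_max=5000):
    """log T(r) = max over 1 <= n <= n_max of n*log(r) - log(M_n)."""
    return max(n * log(r) - log_M(n) for n in range(1, n_max + 1))


def log_factorial(n):
    """log M_n for the analytic class, M_n = n!."""
    return lgamma(n + 1)


def log_2n_factorial(n):
    """log M_n for the first class of Gevray, M_n = (2n)!."""
    return lgamma(2 * n + 1)
```

At $r=100$ the first sequence gives a value near 97, of the order of $r$, while the second gives a value near 8, of the order of $r^{1/2}$. Growth like $r$ makes $\int\log T(r)\,dr/r^2$ diverge and growth like $r^{1/2}$ makes it converge, which is consistent with the statements that the analytic class is quasi-analytic and the classes of Gevray are not.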

If I is the interval 0Sx 2m and Cfy,, is the class of functions even and 
periodic with period 27 and belonging to C;y,) in J, the class C,y,; and the 
class Cfy,) are quasi-analytic at the same time. In other words, if there exists 
a function f(x) not identically zero belonging to C;y,) in J such that the func- 
tion and all its derivatives are zero at a point of J, then there exists a function 
(x) not identically zero belonging to Cfy,; such that the function and all its 
derivatives are zero at a point of J. 

It is known [4, 5] that if there exists in [0, 1] a function f(x) not identi- 
cally zero and belonging to C;y,), with f‘”(0)=0 (m20), then there exists 
in the same interval a function ®(x) not identically zero and belonging to 
With &(”(0) = =0 (n20), and such that ®(x) =@(1—x). Asa 
matter of fact, if Cy, is not quasi-analytic, there exists a function ®(x) not 
identically zero in [0, 1] having the prescribed properties and such that 
| &(»(x)| <M, (n21), [5]. Thus on writing (x) =®(x/2r) and continuing 
this function by periodicity with period 27, one sees immediately that if 
Cim,) is not quasi-analytic, there exists a function ¥(x) belonging to Cfy,, 
and not identically zero, such that py‘ (0) =y‘” (27) =0 (m 20) and such that 
|y~(x)| <M, (n21), x in [0, 2x]; that is, for which-the constant & is unity. 

It can be shown that if Cfy,; is not quasi-analytic then there exists a 
function F(s)=).*_,dm/m*, s=o+it, where this series converges absolutely 
in all the plane (04 =abscissa of absolute convergence = — ©), the function 
F(s) having the following properties : 

(i) If then M(—n) SM , n=1,2,---. 

(ii) F(—2n) =0 for n>n)>0. 

(iii) F(s) is not identically zero. 

Indeed, if Cfy,,; is not quasi-analytic, there exists an i.d. function 
(x) =) "_odm Cos mx, not identically zero, with (0) = =0 (n 20) 
and such that | (”)(x)| <M, (n21). We then have 


(³) J. Hadamard in a communication to Société Mathématique de France (Comptes Rendus
des Séances de la Société Mathématique de France, 1912, p. 28). 




= (— = 0, 


1 
d. = —f ®(x) cos mx dx, 
and so after g integrations by parts we have 


1 al COS Mx 
f (x) dz, 
rm? Jo 


sin mx 
so that 


aol 


Therefore(⁴)


2 2 
|dn| on 
(m*/M,) T(m) 


Now put am=dm/2cm? (m21) where c=)_"_,1/m?. Then the function 
F(s) =) has the desired In the first 


dm 


But 7(m)=m*/M,, and so 


= 
C m=x1 


From (1) it follows that, for 22: 


n—1 
2¢ m=1 
Finally, since ®(x) is not identically zero, and since ®(0) =0, the d», (m 21) 
are not all zero and, all the a,, are not zero, so that F(s) is not identically zero. 
The problem of finding conditions on the sequence {M,} in order that 
the function F(s) =)_"_,am/m® will be zero when F(—2n) =0 (n>mo>0) and 
> am | m” < M,, is analogous to, but technically very different from, prob- 
lems considered by Carlson, VI. Bernstein [1] and others, concerning entire 
functions f(z), z=x-+iy, or functions holomorphic in an angle, and having the 
following properties (considering only entire functions) : If m(r) is a function 
of r increasing to infinity with 1, |f(z)| <m/(r) for |z| =r and f(m) =0 (21). 
What then are conditions on the function m(r) in order that f(z) will be identi- 
cally zero? This problem was solved in many circumstances [1], and the solu- 
tions show that the requirement f(m) =0 can be replaced by f(v,) =0, if the 


(⁴) Such inequalities were first used by Mandelbrojt. See [4 or 5].


But 




|_| increase in a suitable manner. In the light of these results it seems natu- 
ral to suppose that with the same conditions on the sequence { M,} in the 
problem concerning the series }>*1a»,/m*, F(s) will be identically zero if we 
suppose F(—v,) =0, the v, being real and tending toward infinity, with suit- 
able hypotheses on the mode of increase of the vz. 

It seems reasonable then to expect that if the sequence { M,} satisfies the 
conditions of quasi-analyticity, or a condition analogous to these, a function 
belonging to Cfiy,, will be zero when the function and its derivatives of orders 
VY, are zero at a point, provided the integers v, increase in a suitable manner. 

It is important to notice that while the conditions on the sequence { M,} 
are the same in order that a function in either the class C;y,, or the class Cy, 
will be zero when it and its derivatives of al/ orders are zero at a point, it is 
impossible to extend to the class C;y,, the notion of quasi-analyticity when 
only those derivatives of certain orders are zero at a point. The very simple 
example of the function f(x) =x, which belongs to every class and which has 
derivatives of orders n 22 zero, proves it. This explains why we speak above 
only of the class Cfy,). Nevertheless, it will also be possible to extend our 
results to non-periodic functions of a class C;y,): if the function behaves in a 
certain manner at infinity. 

In a word, we are going to give conditions on the sequence { M,} in order 
that a function of Cfy,) will be identically zero knowing only that a suitable 
partial sequence of the derivatives of the function, along with the function 
itself, is zero at a point. In addition, we shall give conditions on the sequence 
{ M,} in order that a function f(x) of Ciy,; in [0, ©), such that fO|f(x)|dx 
< , will be identically zero, again knowing only that a suitable partial se- 
quence of its derivatives, along with the function itself, is zero at a point. 

If we still suppose that a suitable partial sequence of its derivatives, 
along with the function itself, is zero at a point, but we no longer suppose 
that | f(x) |dx< we shall prove that if lim supa... | f™(0)|/"< then 
from our conditions on { M,} it follows only that lima... |f‘(0)|“"=0. 

If the sequence of the orders of the derivatives supposed zero at a point
is $\{\nu_m\}$, the distribution of the integers $\nu_m$ will be characterized by the
quantity $G=\limsup_{m\to\infty}(\nu_m/m)$. In all cases our results will be valid if only $G<2$.


Part I 


1. It is well known that the classical problem of the quasi-analyticity of
$C_{\{M_n\}}$ is equivalent to the problem of Watson (see [2, 4, 8]); that is, the
problem of determining necessary and sufficient conditions on the sequence $\{M_n\}$
in order that a function $F(z)$, holomorphic in a half-plane (say $R(z)>1$(⁵)) and
satisfying the inequality $|F(z)|<M_n/|z|^n$, will be identically zero. Our
method consists precisely in the generalization of Watson’s problem and the
solution of the problem thus generalized.


(⁵) $R(z)$ denotes the real part of $z$.




We shall define analytic functions Fi(z) and F(z) as follows: If f(x) be- 


longs to Cfy,), 


\[ F_1(z) = \int_0^{2\pi} f(x)e^{-zx}\,dx. \]


If $f(x)$ belongs to $C_{\{M_n\}}$, where the interval of definition is taken to be
$0\le x<\infty$, and $f(x)$ is such that $\int_0^\infty|f(x)|\,dx<\infty$, we define


\[ F_2(z) = \int_0^\infty f(x)e^{-zx}\,dx. \]


The function $F_1(z)$ is an entire function and $F_2(z)$ is holomorphic in the
right half-plane $R(z)>0$, and bounded there. That is, if $z$ is any point of
$R(z)>0$, $|F_2(z)|$ is less than the constant $\int_0^\infty|f(x)|\,dx$. If we integrate $n$ times
by parts, we have


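(2) \[ F_1(z) = (1-e^{-2\pi z})\sum_{m=0}^{n-1}\frac{f^{(m)}(0)}{z^{m+1}} + \frac{1}{z^n}\int_0^{2\pi}f^{(n)}(x)e^{-zx}\,dx, \]


(3) \[ F_2(z) = \sum_{m=0}^{n-1}\frac{f^{(m)}(0)}{z^{m+1}} + \frac{1}{z^n}\int_0^\infty f^{(n)}(x)e^{-zx}\,dx, \qquad R(z)>0. \]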
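
The first of the $n$ integrations by parts behind (3) can be written out as follows. This is a sketch under the stated assumption that $f(x)e^{-zx}$ tends to zero as $x\to\infty$ for $R(z)>0$, which holds for instance when $f'$ is bounded.

```latex
% One integration by parts for F_2, assuming f(x) e^{-zx} -> 0 as x -> \infty and R(z) > 0.
\begin{align*}
F_2(z) &= \int_0^\infty f(x)e^{-zx}\,dx
        = \Big[-\frac{f(x)e^{-zx}}{z}\Big]_0^\infty
          + \frac{1}{z}\int_0^\infty f'(x)e^{-zx}\,dx \\
       &= \frac{f(0)}{z} + \frac{1}{z}\int_0^\infty f'(x)e^{-zx}\,dx .
\end{align*}
% Repeating the step on the remaining integral produces the terms f^{(m)}(0)/z^{m+1}.
```

For a function of period $2\pi$ the boundary term over $[0, 2\pi]$ is $f(0)(1-e^{-2\pi z})/z$ instead, which accounts for the factor in front of the sum in the periodic case.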


In the classical case, where all derivatives of $f(x)$ are supposed zero at the
origin, the sums appearing in (2) and (3) are zero, which enables one to pass
directly to Watson’s problem. In that case, for any $a>0$, in the right
half-plane $R(z)\ge a$ both $|F_1(z)|$ and $|F_2(z)|$ are less than $ck^nM_n/|z|^n$, where $c$
is a constant and $k$ depends only on $f(x)$. This is not true, however, in the case
when it is only known that a suitable partial sequence of the derivatives of $f(x)$
vanish at the origin.

Along with the hypothesis that the derivatives of $f(x)$ of orders $\nu_m$ are zero
at the origin, we shall first assume that $\limsup_{n\to\infty}|f^{(n)}(0)|^{1/n}<\infty$. This
restriction will be removed later.

While, from what was stated in the introduction, it seemed natural to 
conceive of our problem for the class Cfy,,, composed of all even periodic 
functions belonging to C,y,}, and this problem certainly does not have mean- 
ing for all functions of this latter class, it will be expedient to begin our study 
for those functions of C,y,, for which such a study is possible without suppos- 
ing periodicity. The periodic case will be treated in Part II. 

Let us first prove the following simple lemma: 


Lemma I. Let $\{\nu_n\}$ be a sequence of positive integers, and $\{\lambda_n\}$ the sequence
of integers complementary to the sequence $\{\nu_n\}$ with respect to the non-negative
integers. If


(a) \[ \liminf_{n\to\infty}\frac{\lambda_n}{n}\ge G'>1, \]


then


(b) \[ \limsup_{n\to\infty}\frac{\nu_n}{n}\le\frac{G'}{G'-1}. \]


Conversely, from (b), with $G'>1$, follows (a).


Denote by $M(t)$ the number of integers $\nu_n\le t$, and by $N(t)$ the number of
integers $\lambda_n\le t$. If $\lambda_n\le t<\lambda_{n+1}$, then $N(t)=n$ and

\[ \frac{\lambda_n}{n}\le\frac{t}{N(t)}<\frac{\lambda_{n+1}}{n}, \]

from which it follows that $\liminf_{t\to\infty}(t/N(t))=\liminf_{n\to\infty}(\lambda_n/n)$. Thus from (a)
we have that $\limsup_{t\to\infty}(N(t)/t)\le 1/G'$. On the other hand, $M(t)+N(t)=[t]+1$,
where $[t]$ is the greatest integer contained in $t$. But from this it follows that

\[ \lim_{t\to\infty}\Big(\frac{M(t)}{t}+\frac{N(t)}{t}\Big)=1. \]

Since $\limsup_{t\to\infty}(N(t)/t)\le 1/G'$, we have that $\liminf_{t\to\infty}(M(t)/t)\ge 1-(1/G')=(G'-1)/G'$,
and (b) follows from this.

The converse also follows immediately from the above considerations.
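
The counting identity $M(t)+N(t)=[t]+1$ used in this proof, and the relation between the two limits, can be checked on a concrete pair of complementary sequences. The sketch below takes $\nu_n=3n$ as an illustrative choice, truncated at a finite limit; the helper names are hypothetical.

```python
from math import floor


def complementary(nu, limit):
    """Non-negative integers <= limit that do not occur in nu."""
    excluded = set(nu)
    return [k for k in range(limit + 1) if k not in excluded]


def count_up_to(seq, t):
    """Number of terms of seq that are <= t."""
    return sum(1 for s in seq if s <= t)


limit = 3000
nu = list(range(3, limit + 1, 3))      # nu_n = 3n, positive multiples of 3
lam = complementary(nu, limit)         # 0, 1, 2, 4, 5, 7, ...

ratio_lam = lam[-1] / len(lam)         # approximates lim lambda_n / n
ratio_nu = nu[-1] / len(nu)            # approximates lim nu_n / n
```

Here `ratio_lam` is close to $G'=3/2$ and `ratio_nu` equals 3, which is the value $G'/(G'-1)$ that the lemma allows.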


2. THEOREM I. Suppose $f(x)$ belongs to $C_{\{M_n\}}$ in the interval $0\le x<\infty$ and
is such that $\int_0^\infty|f(x)|\,dx<\infty$. Moreover, suppose $\limsup_{n\to\infty}|f^{(n)}(0)|^{1/n}<\infty$ and
$f^{(\nu_m)}(0)=0$ $(m\ge 1)$. If $\limsup_{m\to\infty}(\nu_m/m)<2$, and if $\int_1^\infty\log T(r)\,dr/r^2=\infty$,
where $T(r)=\mathrm{l.u.b.}_{n\ge 1}(r^n/M_n)$, then $f(x)$ is identically zero.


If we denote by $R$ $(<\infty)$ the number $\limsup_{n\to\infty}|f^{(n)}(0)|^{1/n}$ and let
$P_\beta=R+\beta$, where $\beta$ is any positive number, the series $\sum_{n=0}^\infty f^{(n)}(0)/z^{n+1}$
converges uniformly in $|z|\ge P_\beta$ to a holomorphic function $\Phi(z)$. Let $\gamma$ be any
positive number and denote by $D(\beta, \gamma)$ the region common to $|z|\ge P_\beta$ and
$R(z)\ge\gamma$. Then in $D(\beta, \gamma)$, from (3) we have


Since $\limsup_{n\to\infty}|f^{(n)}(0)|^{1/n}=R$, there is a constant $P>\max(R, 1)$ such
that $|f^{(n)}(0)|<P^n$ for all $n\ge 1$. Now let $\beta$ be chosen so that $P_\beta>P>1$. Then
in $|z|\ge P_\beta$


= — = 


Pp. 2 pm 1 pr 


(5) 


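where $\kappa_1$ is a constant.
On the other hand, $f(x)$ belongs to $C_{\{M_n\}}$ in $0\le x<\infty$ and so
$|f^{(n)}(x)|<k^nM_n$ $(n\ge 1)$ for $0\le x<\infty$, where $k$ depends only on $f(x)$, from
which it follows that

\[ \Big|\frac{1}{z^n}\int_0^\infty f^{(n)}(x)e^{-zx}\,dx\Big| < \frac{k^nM_n}{|z|^n}\int_0^\infty|e^{-zx}|\,dx. \]

Hence for all $z$ in $R(z)\ge\gamma$,


(6) \[ \Big|\frac{1}{z^n}\int_0^\infty f^{(n)}(x)e^{-zx}\,dx\Big| < \frac{\kappa_2k^nM_n}{|z|^n}, \]


where $\kappa_2$ is a constant.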
From (4), (5) and (6), it results that in D(8, +) 


+ xok"M,, 


If lim inf,... (M,)"">0, there is a constant c such that 


| Fx(2) — (2) | < 


c"M, 


for all z in D(8, y). But since fflog T(r) dr/r?= ©, by Carleman’s solution 
of Watson’s problem [4, 8], F2(z)=(z) in D(8, +). 
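If $\liminf_{n\to\infty} (M_n)^{1/n} = 0$, to any positive constant $g$ $(< \infty)$ there corresponds
a sequence $\{n_i\}$ of integers such that $M_{n_i}^{1/n_i} < g$ $(i \ge 1)$. Therefore in
$D(\beta, \gamma)$

$|F_2(z) - \Phi(z)| < \dfrac{\kappa_1 P^{n_i} + \kappa_2 (kg)^{n_i}}{|z|^{n_i}} < \dfrac{Q^{n_i}}{|z|^{n_i}}, \qquad i \ge 1,$

where $Q$ $(< \infty)$ is a constant. If we denote by $\{M_n'\}$ the sequence such that
$M'_{n_i} = Q^{n_i}$ $(i \ge 1)$ and $M_n' = \kappa_1 P^n + \kappa_2 k^n M_n$, $n$ not equal to any $n_i$, then in
$D(\beta, \gamma)$

$|F_2(z) - \Phi(z)| < \dfrac{M_n'}{|z|^n}, \qquad n \ge 1.$

Now if we let $T_1(r) = \mathrm{l.u.b.}_{n \ge 1} (r^n/M_n')$, then $T_1(r) = \infty$ for $r > Q$. For, if
$r > Q$, we have

$T_1(r) \ge \mathrm{l.u.b.}_{i \ge 1} \dfrac{r^{n_i}}{M'_{n_i}} = \mathrm{l.u.b.}_{i \ge 1} (r/Q)^{n_i} = \infty.$

Therefore $\int^\infty \log T_1(r)\,dr/r^2 = \infty$ and again $F_2(z) \equiv \Phi(z)$ in $D(\beta, \gamma)$.
Thus, under the hypotheses of the theorem, we have in all cases that
$F_2(z) \equiv \Phi(z)$ in $D(\beta, \gamma)$ and hence in the entire region of definition of the functions.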




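If we let $\psi(\zeta) = F_2(1/\zeta)$, $\psi(\zeta)$ is holomorphic in $R(\zeta) > 0$, and in $|\zeta| < 1/R$,

(7) $\psi(\zeta) = \sum_{m=1}^{\infty} f^{(\lambda_m)}(0)\,\zeta^{\lambda_m+1} = \sum_{m=1}^{\infty} a_m \zeta^{\lambda_m+1},$

where $\{\lambda_m\}$ is the sequence of integers complementary to the sequence $\{\nu_m\}$
with respect to the non-negative integers. By hypotheses $\int_0^\infty |f(x)|\,dx < \infty$ and
so $|F_2(z)| \le \int_0^\infty |f(x)|\,dx$ for all $z$ in $R(z) > 0$. Therefore we have that $\psi(\zeta)$ is
holomorphic and bounded in $R(\zeta) > 0$ and is given by (7) in $|\zeta| < 1/R$.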

The conclusion of the theorem is now obtained from the following fact, 
which for later reference will be stated as 


Lemma II. Let the function $F(z)$ be defined by the series $\sum_{m=1}^{\infty} a_m z^{\lambda_m}$ in
$|z| < \rho$ and let $G' = \liminf_{m\to\infty} (\lambda_m/m)$. If $F(z)$ is not identically a constant, in
each closed angle with vertex at the origin and with opening $2\pi/G'$, one of the two
following possibilities must exist. Either

(i) $F(z)$ has a singular point, or

(ii) $F(z)$ is not bounded.


This result is a part of a theorem of Mandelbrojt. For the proof see [6,
p. 15 ff.].
Since $\{\nu_m\}$ and $\{\lambda_m\}$ are complementary sequences with respect to the
non-negative integers, and since $\limsup_{m\to\infty} (\nu_m/m) < 2$, it follows from Lemma
I that $G' > 2$. Hence there exist closed angles in the right half-plane with vertex
at the origin and with opening $2\pi/G'$. Therefore $\psi(\zeta)$ must be identically
a constant since neither (i) nor (ii) is true for $\psi(\zeta)$ in the right half-plane.
Moreover, since $\psi(0) = 0$, this constant must be zero. Therefore
$a_m = f^{(\lambda_m)}(0) = 0$ for all $m \ge 1$. Thus all derivatives of $f(x)$, along with $f(x)$ itself,
are zero at $x = 0$ since $f^{(\nu_m)}(0) = 0$, $m \ge 1$. Then, since $\int^\infty \log T(r)\,dr/r^2 = \infty$, the
class $C_{\{M_n\}}$ is quasi-analytic and $f(x)$ is identically zero.

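3. If the restriction $\int_0^\infty |f(x)|\,dx < \infty$ is removed, the following result can
be obtained:


THEOREM II. Suppose $f(x)$ belongs to $C_{\{M_n\}}$ in $0 \le x < \infty$ and $f^{(\nu_m)}(0) = 0$,
$m \ge 1$. Moreover, suppose $\limsup_{n\to\infty} |f^{(n)}(0)|^{1/n} < \infty$. If $\limsup_{m\to\infty} (\nu_m/m) < 2$,
and if $\int^\infty \log T(r)\,dr/r^2 = \infty$, then $\lim_{n\to\infty} |f^{(n)}(0)|^{1/n} = 0$.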


In order to prove this theorem we shall use the following lemmas. 


Lemma III. Let 0(s) be defined by the Dirichlet series >*_,dme*™’, where 
lim infmoo Hm/m2L>0. Let be a positive quantity and If 0(s) is 
holomorphic in the circle |s—s:| and if |0(s)| <M in this circle 


(6) A function represented by a Dirichlet series, which, of course, we suppose always to
have an axis of convergence, will be said to be holomorphic at a point $\sigma_0 + it_0$, if it is possible to
continue analytically the function given by the sum of the series along the line $t = t_0$ through the
point $\sigma_0 + it_0$. The value $\theta(\sigma_0 + it_0)$ is the value given by this continuation.


then for eachj=1,2,---, 

| ds|| gXud | mm < KM, 
where K, is independent of j,0,and M, and 


If 6(s) is bounded in the whole strip |t| <#/2, and if L>2, since u>0 
it is then evident that d;=0 for every j=1,2,---. 

This lemma was proved by Mandelbrojt [6, p. 14 ff.] and was used by 
him in the proof of the theorem which is given in the present paper in a modi- 
fied form as Lemma II. Lemma III was also proved for entire functions by 
Mandelbrojt and Gergen [7]. 

From Lemma III results the following lemma which was given by Mandel- 
brojt in his lectures at Rice Institute and for which we shall now give the 
proof: 


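Lemma IV. Let $F(z)$ be given in the neighborhood of the origin by the Taylor
series $\sum_{m=1}^{\infty} a_m z^{\mu_m}$ with $\liminf_{m\to\infty} (\mu_m/m) = D > L$. If $F(z)$ is holomorphic in the
sector $|z| < \rho$, $|\arg z| < \pi/L$, then $\limsup_{m\to\infty} |a_m|^{1/\mu_m} \le 2^3/\rho$ (8).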


If we put z=e~*, the function ®(s) = F(e~*) is holomorphic for values of 
s=o+it such that o> —log p—z/L, | </L, and in a half-plane o>c, ®(s) 
is given by the series )\*_,ame7*™. 

There exists therefore a constant ¢€>0 such that ®(s) is holomorphic in 


the circle |s—so| S(x/D)(1+€), where so= —log p. Hence by Lemma III we 
have 


(9) | a;|| g(us)| < KMo™, 
where g,(z) is again the function defined in (8) and M=max | &(z)| in 
|s—so| $(#/D)(1+e). 
We now evaluate g,(u,). Since the yw», are increasing positive integers. 


Mite + k>0O, 
and 


S — OSk <j. 
Therefore 


(7) We suppose $\mu_1 > 0$, and as usual, $\{\mu_n\}$ increasing to infinity.
(8) We again suppose $\mu_1 > 0$.




2 


n=l p=—i+1, pd (us + p)? 


=n, 


map j—i+1 


ma 1, m= 


(10) 


But since 
2 
2 


we see that the left member of this last equality is equal to —y; cos mpy,;/2 
On the other hand, 


m 


J)! ws [os — DN! 
(2u;-7j)! My! j j 

— J)! (us — 

< = 23«i-7, 
Mi 


From (10) we then have that 
2. 
| | = (us /f)2 


From (9) it then follows that 


and from this the statement of the lemma results immediately. 

We are now ready to pass to the proof of Theorem II. It should be noticed 
that even under the present hypotheses, the function F2(z) defined by (3) 
exists for each z in R(z) >0 and is holomorphic there. For, 


(*) If 4;=/, the quantity in the second brackets must be taken equal to unity wherever it 
appears. 




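$f(x) = f(0) + \int_0^x f'(t)\,dt$


and $|f'(x)| < kM_1$ in $0 \le x < \infty$, so that $|f(x)| \le |f(0)| + kM_1 x$.
Therefore $\int_0^\infty f(x)e^{-zx}\,dx$ converges uniformly in $R(z) \ge \alpha > 0$ and defines a holomorphic
function there. But under the present hypotheses it can no longer
be said that $F_2(z)$ is bounded in $R(z) > 0$ and so it is not possible to draw the
same conclusions as in Theorem I. However, the function $\psi(\zeta) = F_2(1/\zeta)$ is
holomorphic in $R(\zeta) > 0$ and is given by the series $\sum_{m=1}^{\infty} f^{(\lambda_m)}(0)\,\zeta^{\lambda_m+1}$ in
$|\zeta| < 1/R$ where $R = \limsup_{n\to\infty} |f^{(n)}(0)|^{1/n}$. By Lemma I, $\liminf_{m\to\infty} (\lambda_m/m) > 2$
since $\limsup_{m\to\infty} (\nu_m/m) < 2$. Hence we can apply Lemma IV with $L = 2$ to the
function $\psi(\zeta)$. The sequence $\{\mu_m\}$ is replaced by the sequence $\{\lambda_m+1\}$,
$a_m = f^{(\lambda_m)}(0)$ and $\rho$ can be any positive number. Therefore


$\limsup_{m\to\infty} |f^{(\lambda_m)}(0)|^{1/(\lambda_m+1)} \le 2^3/\rho,$


and since $\rho$ is any positive number


$\lim_{n\to\infty} |f^{(n)}(0)|^{1/n} = \lim_{m\to\infty} |f^{(\lambda_m)}(0)|^{1/\lambda_m} = 0.$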


4. We now pass to the case in which it is not assumed that
$\limsup_{n\to\infty} |f^{(n)}(0)|^{1/n} < \infty$. In this case the radius of convergence of the
series $\sum_{m=1}^{\infty} f^{(\lambda_m)}(0)\,\zeta^{\lambda_m+1}$ may be zero and then the series no longer represents
$\psi(\zeta)$. But we shall see that due to the relation which exists between this series
and the function, it is possible to consider the series as somewhat analogous
to an asymptotic development in order to obtain some properties of $\psi(\zeta)$.
It is useful first to establish some preliminary lemmas.


Lemma V. If $\Phi(z)$ is not identically zero and is holomorphic in $|z| \le 1$ except
possibly at $z = -1$, and is bounded in this circle, then


$\int_{-\pi}^{\pi} \log |\Phi(e^{i\theta})|\,d\theta > -\infty.$


This lemma is well known. A proof is given by Ostrowski [8].
Let {N,} be a sequence of positive numbers and {y,} a sequence of posi- 
tive numbers increasing to infinity. If w is any positive number, we define 


TAf) = 1.U.D. 


Lemma VI. Let {N,} be a sequence of positive numbers and {u»} a sequence 
of positive numbers increasing to infinity. Suppose A(£) is holomorphic and 
bounded in R(t) 20, and 




where $k$ is a constant. If $\int^\infty \log \tau_w(r)\,dr/r^2 = \infty$, then $A(\xi) \equiv 0$.


(11) | A()| n= 1, in = 0, 


From (11) we have 


0, 


1 1 
A 


and hence 
(12) $\log |A(\xi)| \le -\log \tau_w(|\xi|/k).$


Now let $\xi = (1-z)/(1+z)$ and $B(z) = A[(1-z)/(1+z)]$. $B(z)$ is then holomorphic
in $|z| \le 1$ except possibly at $z = -1$, and is bounded in this circle.
If $A(\xi)$ were not identically zero, $B(z)$ would not be identically zero and by
Lemma V we would have


1— 
| B(e#) | do = J A d>— ~, 


It would then follow from (12) that 
“\k| 1 + + 
de 


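But on letting $(1/k)\,|(1-e^{i\theta})/(1+e^{i\theta})| = (1/k)\tan (\theta/2) = r$ we see that the
first integral is equal to

$2k \int_0^\infty \dfrac{\log \tau_w(r)}{1+r^2k^2}\,dr,$


and since $\lim_{r\to\infty} [r^2/(1+r^2k^2)] = 1/k^2$, we would have


$\int_1^\infty \log \tau_w(r)\,\dfrac{dr}{r^2} < \infty,$


which is contrary to hypotheses. Hence it must be true that $A(\xi) \equiv 0$.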
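
The change of variable used in this proof can be checked numerically. The sketch below is an illustration only; the function names and sample points are assumptions, not taken from the paper. It checks that $\xi = (1-z)/(1+z)$ carries sample points of the open unit circle into the right half-plane, and that on the boundary $z = e^{i\theta}$, $0 < \theta < \pi$, the modulus of $\xi$ equals $\tan(\theta/2)$.

```python
import cmath
import math

def to_half_plane(z):
    """The substitution xi = (1 - z) / (1 + z)."""
    return (1 - z) / (1 + z)

def boundary_modulus(theta):
    """|xi| at the boundary point z = e^{i*theta}."""
    return abs(to_half_plane(cmath.exp(1j * theta)))

# Sample points with |z| = 0.9 land in the right half-plane R(xi) > 0.
inside = [0.9 * cmath.exp(1j * t) for t in (0.0, 1.0, 2.0, 3.0, 4.0, 5.0)]
all_in_half_plane = all(to_half_plane(z).real > 0 for z in inside)

# On the boundary, |xi| agrees with tan(theta / 2) for 0 < theta < pi.
boundary_ok = all(
    math.isclose(boundary_modulus(t), math.tan(t / 2), rel_tol=1e-9)
    for t in (0.1, 0.5, 1.0, 2.0, 3.0)
)
```

A finite sample of points is of course no proof; the identity itself follows from $(1-e^{i\theta})/(1+e^{i\theta}) = -i\tan(\theta/2)$.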

We shall now prove the following lemma concerning series which are 
analogous to an asymptotic series. This lemma is a generalization of the theo- 
rem of Mandelbrojt which is given in the present paper as Lemma III. 


Lemma VII. If the sequence $\{d_n\}$ is such that there exist a sequence $\{N_n\}$
of positive numbers, a sequence $\{\mu_n\}$ of increasing positive numbers with
$\liminf_{n\to\infty} (\mu_n/n) = G' > 2$, a positive quantity $w < 1-2/G'$, the two sequences and
the quantity $w$ related by $\int^\infty \log \tau_w(r)\,dr/r^2 = \infty$, and a function $\theta(s)$, $s = \sigma + it$,
holomorphic and bounded for $|t| < \pi/2$, satisfying for every $0 < \delta < 1$ the inequalities


where $L_\delta$ and $\sigma_\delta$ depend only on $\delta$, then $d_n = 0$ $(n \ge 1)$.


The proof has some points in common with the proof of the theorem of 
Mandelbrojt given as Lemma III. The function g,(z) given by (8) is entire 
and odd. Hence its expansion in powers of z contains only odd powers: 


gi(2) = 
p=0 
In the paper by Mandelbrojt and Gergen [7], it was proved that 


where A, depends only on e. 
Define the function 6;(s) as follows: 


6,(s) = 8): 


Let e>Oand a positive that w’ =1—(2/G’)(1+¢«) —5>w>0. 
We shall now prove that the series defining 4;,(s) converges uniformly and abso- 
lutely in the strip S,:|t| <w(w’+8)/2. If s is any point of S., each point u of 
the circumference y: u—s| =72(1+€)/G’ is in the strip |¢| <2/2. Then by 
Cauchy’s integral formula, 


(2p +1)! | || du| (2p + 1)1M 


where M is the bound on @(s) in |¢| </2. Therefore from this inequality 
and (13) with € replaced by €/2, we have that in S, 


1 + 


| 


and so the series defining 0;(s) converges uniformly and absolutely. Moreover 
it also follows from this last inequality that 


1 2\2e+1 
(14) |e(s)| = = K.M, 


for each s in S., where K, depends only on e. 




If we let 
H,(s) = 0(s) — 


m=1 


we have by hypothesis that in | ¢| <m(1—58)/2,0><03, 
| Hn(s) | < n2=1. 


Let S.,s be the half-strip defined by | ¢| <rw’/2, >os+2(1+6)/G’ =o4. If s 
is any point of the half-strip S.,s, each point of the circumference y will be a 
point of the half-strip | ¢| <a(1—6)/2, ¢>0;3. Then by the Cauchy integral 
formula for (s) we have that in S,, 


(2p + 1) 
+ 
This inequality along with (13) yields that in S,,; 


< 
m=1 


n Pe 1 2ptl 
mal 


Now consider the series 
Ech (5) +¥ |, 
p=0 m=1 


By (15) this series converges uniformly and absolutely in S,,; and in this — 
half-strip is equal to 


p=0 m=1 


But since g;(u,)=0 for mj, if n2j, this last expression becomes 6;,(s) 
+d gj(u;)e~*s*, and again from (15) it follows that in S,,s 


1+ 
i+e 


2ptl 
| 6(s) + dig (uje** | s Ac pl ) = L;Ki N,€ 
p=0 
n2j, 


where K/ depends only on «. 
Let 7=m+na=s/w’. Then if we let =0,nw’), we have that in 
/w’, 


| En) + dig < n = j. 


Now let =e" and denote by A,(£) the function corresponding by this 
transformation to E;(n) +d g;(u;)e-““"". We then have in the domain defined 
by R(é) >0, | since | £| 


4 




LiKiN, 
But then there exists a constant k>1 such that for every 1S"<j 
LiKi 
| 

in >0, | Thus we have that 
LsKi N, 
| 


| | < 


’ n= j. 


| A,é)| < 


| A,()| < n= 1, 


in >0, | 
Now let 
A(t) = + £)/LiKi, 


where a=e"*/“ +1. A(£) is holomorphic in R(£) 20 and from (14) it follows 
that it is bounded in this closed half-plane. From the above inequality we 
have that 


ke'™N, 
LiKi ~ | 


in R(E) 20, since a is a real positive quantity. . 

Since w’ >w, if [flog r.(r)dr/r?= ©, then {flog ru(r)dr/r?= ©. Thus all 
the conditions of Lemma VI are satisfied by the function A(£) and the se- 
quences {u,} and {N,}, where the w of that lemma becomes the present w’. 
Therefore A (£) =0, and we have that A ,(£) =0. From this it follows that 


= — dig(uje*. 
From (14) we then have that 
| ds|| gus) | < MK. 


| A@®| = 


n = 1, 


for all ¢. And since this inequality holds for all o, and since 4: >0, it follows 
that d;=0 for j7=1, 2, --- . With this the lemma is proved. 

Let there be given a sequence { M,} of positive numbers, a sequence {v,} 
of positive integers and a positive constant w. We shall define 


T(r) = Lub. 


1 


where {X,} is the sequence of positive integers complementary to the se- 
quence {vn} with respect to the non-negative integers. 


THEOREM III. Suppose $f(x)$ belongs to $C_{\{M_n\}}$ in $0 \le x < \infty$ and is such that
$\int_0^\infty |f(x)|\,dx < \infty$. Moreover, suppose $f^{(\nu_n)}(0) = 0$ $(n \ge 1)$. If $G = \limsup_{n\to\infty} (\nu_n/n) < 2$,
and if there exists a positive constant $w < 2/G-1$ such that $\int^\infty \log T_w(r)\,dr/r^2 = \infty$,
then $f(x)$ is identically zero.


We shall again consider F:(z) defined by (3), but now in the half-plane 
R(z) =a>0. We then have in this veg 


n—1 (m)(Q 


| Fa(z) — 


Since f(x) belongs to C;y,) in 0sx< ~~, it follows that 


n—1 f(m) (0 k*M, 
Fate) -> < n= 1, 


m=0 


in R(z) 2a, where c is a constant. 
If we let z=e*, s=o+it, and V(s) = F,(e*), on recalling that =0 
(n 21), the last inequality becomes 


(16) | V(s) — | < be 
m=1 

in $D_\alpha$, where $D_\alpha$ is the image of $R(z) \ge \alpha$ by the transformation $z = e^s$. Here
$\{\lambda_n\}$ is the sequence of integers complementary to the sequence $\{\nu_n\}$ with
respect to the non-negative integers. Let $G' = \liminf_{n\to\infty} (\lambda_n/n)$. If we let
$\mu_n = \lambda_n + 1$, it is evident that $G' = \liminf_{n\to\infty} (\mu_n/n)$. Since by hypotheses
$G < 2$, we have by Lemma I that $G' > 2$.

To each 0<6<1, there corresponds a number ¢; such that each point of 
the half-strip <m(1—6)/2, is a point of D.. Therefore from (16) we 
have that in this half-strip 


V(s) — | < Nie", n= 1, 
mal 
where 
We also have © 


Ontl) ye 


Lu.b,. ———- = — Lu.b. 
Nn n21 My ck n21 My,41k* 


From this it is then evident that the divergence of {flog 7.(r)dr/r* follows 
from the divergence of {flog T.(r)dr/r?. 

Thus $\Psi(s)$ satisfies all the conditions on the function $\theta(s)$ in Lemma VII,
where $f^{(\lambda_n)}(0) = d_n$. Hence by this lemma $f^{(\lambda_n)}(0) = 0$, $n \ge 1$. But then all the
derivatives of $f(x)$ as well as the function itself are zero at $x = 0$.


nit) 
Ge 
ck 




On the other hand, since w<2/G—1=1—2/G’ <1, if r>1: 


T(r) .u.b. = .u.b. = 
M, M), +1 M), +1 


= r°T.(r). 


Therefore from the divergence of $\int^\infty \log T_w(r)\,dr/r^2$ results the divergence of
$\int^\infty \log T(r)\,dr/r^2$, and so under our present hypotheses the class $C_{\{M_n\}}$ is quasi-analytic.
Therefore $f(x) \equiv 0$ and the theorem is proved.


Part II 


5. If f(x) belongs to Cfy,;, G=lim supa... (v,/m) is always less than or equal 
to 2. For, the sequence ea of the orders of the derivatives supposed zero 
at the origin must contain all the odd integers, since all the odd derivatives are 
zero there. If lim sups.. |f‘”(0)|/"< © with no further restrictions on G, 
we will prove that f(x) is a trigonometric polynomial 


THEOREM IV. If f(x) belongs to Cha.) where fflog T(r)dr/r?= 0, and if 
lim supp.» |f‘”(0)|/"< 0, then f(x) is a trigonometric polynomial. 


In this case we consider the function F(z) defined by 8 


F,(z) = = (1 — = 
0 


Suppose 
f(x) = ¥ an cos mz. 


m=0 


Then 


(17) Fi(z) = am cos mx dx = 2(1 — e***) = 

The above termwise integration is valid and the series obtained converges uni- 
formly in any bounded closed region; that is, the remainder ))m-,4m/(z*-+m?) 
tends uniformly to zero as m tends to infinity. As in the case of Part I, since 
JTlog T(r) dr/r?= ©, F,(z) is given by 


(m)(0 


in |z| >R, where R=lim sup,.. | f(0)| =<, But this latter series con- 
a to a holomorphic function in |z| >R and so it follows from (17) that 
=0 for m>R. This proves the theorem. 
ie In the case in which it is no longer assumed that lim supn.. |f‘"(0)| 
< «©, we have the analogue of Theorem III of Part I:- 




THEOREM V. Suppose f(x) belongs to Cfy,,; and f’”(0)=0 (n21). If 
G=lim supa... (¥n/n) <2 and if there exists a positive constant w<2/G—1 such 
that {* log T(r) dr/r? = «© , then f(x) is identically zero. 


Here again we consider the function F(z) and from (2) we have 


Then in R(z) =y>0, 
| F(z) f(m™(Q)| /k"M, 


where c’ is a constant. 
If we let z=e* and V(s) = F,(e*)/(1—e-?**), on recalling that =0 
(n 21), this last inequality becomes 


n 


m=1 


But this is precisely inequality (16) with c replaced by c’. Therefore in this 
case also the theorem now follows immediately from Lemma VII. 


BIBLIOGRAPHY 


1. Vladimir Bernstein, Leçons sur les Progrès Récents de la Théorie des Séries de Dirichlet,
Gauthier-Villars, Paris, 1933.

2. T. Carleman, Les Fonctions Quasi-Analytiques, Gauthier-Villars, Paris, 1926.

3. A. Denjoy, Sur les fonctions quasi-analytiques de variable réelle, Comptes Rendus de
l'Académie des Sciences, Paris, vol. 173 (1921), p. 1329.

4. S. Mandelbrojt, Séries de Fourier et Classes Quasi-Analytiques de Fonctions, Gauthier-Villars,
Paris, 1935.

5. S. Mandelbrojt, Rice Institute Pamphlet, vol. 29, 1942.

6. S. Mandelbrojt, Séries Lacunaires, Actualités Scientifiques et Industrielles, Exposés sur la
théorie des fonctions, 1936.

7. S. Mandelbrojt and J. J. Gergen, On entire functions defined by a Dirichlet series, American
Journal of Mathematics, vol. 53 (1931), p. 1.

8. A. Ostrowski, Über Quasianalytische Funktionen und Bestimmtheit Asymptotischer Entwickelungen,
Acta Mathematica, vol. 53 (1930), p. 181.


THE RICE INSTITUTE,
HOUSTON, TEXAS


A UNIFIED THEORY OF PROJECTIVE SPACES AND 
FINITE ABELIAN GROUPS 


BY 
REINHOLD BAER 


The similarity between finite-dimensional projective spaces and finite
abelian groups has often been noted(1); and thus one may expect that the
more general features of these two theories are identical. But the likeness is
more than a superficial one; and consequently it is possible to give a unified
treatment for spaces and groups.

It may be worthwhile to indicate in a few lines the developments leading 
up to a scientific situation that made such a joint theory as we are offering 
here a possibility. The evolution of geometrical thought pertinent to our 
problem is perhaps best described by two textbooks: Bôcher's Higher Algebra,
which exposed the identity of geometry and the theory of linear equations;
and Veblen and Young's Projective Geometry whose presentation of
the theory broke down the restriction to the two geometries over the real 
and the complex number field; and enlarged the domain to be considered to 
the projective geometries over any sort of field, whether finite or infinite, 
commutative or not. Any further progress had to be a progress in the theory 
of linear equations; and this was found in the treatment of the theory with- 
out using determinants—a concept that had to be thoroughly debunked to 
make these (and other) extensions possible. This was the starting point for 
further generalizations, generalizations in a direction that is different from 
ours—notably the theory of linear equations with infinitely many variables 
and in particular its geometrical counterpart, J. von Neumann’s continuous 
geometry. 

In the theory of finite abelian groups only one generalization was needed. 
We refer to the extension of this theory by introducing the concept of an 
abelian group which admits operators from some given ring or some other 
domain. For this concept makes it possible to consider projective geometry 
a (rather special) chapter in the theory of abelian groups, since the n-dimensional
projective space over the (not necessarily commutative) field F of
coordinates is nothing but the set of F-admissible subgroups of an abelian
group of rank n+1 over F and the linear forms over this geometry are just
the characters of this underlying group. Our problem is now easily stated:
to characterize a class of abelian operator groups which comprises both the 

Presented to the Society, April 11, 1941; received by the editors February 13, 1941, and, in 
revised form, September 19, 1941. 


(1) See, for example, R. D. Carmichael, Finite geometries and the theory of groups, American 
Journal of Mathematics, vol. 52 (1930), pp. 754-788. 




finite abelian groups and the abelian groups over fields as special cases; and 
to develop a theory of this class of groups which contains finite-dimensional 
projective geometry and finite commutative group theory as special cases. 

It is clear from the preceding remarks that the theory of abelian operator 
groups may be considered from two rather different points of view, for it is 
both a special chapter in the theory of groups and a generalization of projec- 
tive geometry and the theory of linear forms; and according to the point of 
view preferred, one will write the group composition as “multiplication” or 
“addition,” the operators as “exponents” or “multipliers.” 

The present investigation has been divided into three parts and this di- 
vision has been guided by the customary organization of projective geome- 
try. There is first the synthetic theory which deals with the subgroups and 
their combinations and which does not make any use of the elements of the 
underlying group, their addition or their multiplication by operators. The 
analytic theory concerns itself with those facts that either cannot be proved 
without making use of the group elements and their combinations (or equiv- 
alent hypotheses) or which actually involve them in their statement. The 
third part is devoted to the construction of the underlying abelian operator 
group to a given partially ordered set meeting certain requirements ( =intro- 
duction of coordinates in the terminology of projective geometry) and which 
may be considered the main contribution of this investigation. Though this 
part takes an intermediate place between the synthetic and the analytic 
theory, we had to place it at the end for technical reasons.—A little more 
detailed account of the contents of these three parts will be found in the 
introductions prefacing them. 


Part I. THE SYNTHETIC THEORY 


The system of admissible subgroups of an abelian operator group is 
known to form a partially ordered set, containing the sums and cross-cuts 
of its elements and obeying Dedekind's law, in short, a Dedekind set(2). Thus
it is natural to make Dedekind sets the framework around which to build
the synthetic theory. The atoms of a projective space are its points; and
likewise one may consider as the atoms of a finite abelian group its cyclic 
subgroups of order a power of a prime. Consequently we choose as the atoms 
of our theory the cycles, that is, elements in a Dedekind set whose parts form 
a finite ordered set. In order to obtain a satisfactory theory including in 
particular the theory of dimension (or rank), the existence of complementary 
subspaces and the basis theorem for finite abelian groups, it turns out to be 
necessary and sufficient to impose the following two conditions, upon the 


(2) There exists an extensive literature on partially ordered sets, notably the work of
G. Birkhoff and O. Ore; for a survey of this theory, see G. Birkhoff, Lattice Theory, American 
Mathematical Society Colloquium Publications, vol. 25, 1940. It should be noted that we use 
only very elementary parts of this theory, and that we state in full whatever we use. 




Dedekind set under consideration. (1) Every element is a sum of cycles. 
(2) A quotient system is a cycle if, and only if, it contains at most one 
smallest element not zero (=subcycle of order 1 = point). 

To obtain a satisfactory geometry one has to assume that lines carry at 
least three different points. The group-theoretical counterpart is the restric- 
tion to primary groups (=groups of order a power of a prime). Thus we 
have to prove a generalization of the reduction theorem stating that every 
finite abelian group is the direct sum of its primary components (= partition 
into relatively prime, primary components); and we show that restriction to
primary systems is equivalent to substituting for condition (2) the following 
condition. (2’) A quotient system is a cycle if, and only if, it contains at 
most two different smallest elements not zero. 

1. Preliminaries(3). The framework for our investigation of the system of
admissible subgroups of a (primary) abelian operator group is provided by 
concepts like “partially ordered set,” “lattice,” “structure,” and so on. Since 
these systems will be required to satisfy Dedekind’s law, we shall term them 
“Dedekind sets.” Such a Dedekind set is a system D of elements connected 
by the following three relations: the cross-cut, meet, intersection or product 
fg of the elements f, g; the join or sum f+g of the elements f and g; the relation
$f \le g$ (in words: f is a part of or contained in g). The rules by means of
which the first two operations may be reduced to the third one, and conversely,
may be stated as follows:

fg is the greatest element contained in both f and g.

f+g is the smallest element containing both f and g.

$f \le g$; $f = fg$ and $f+g = g$ are equivalent assertions.

Let us add finally that the relation $f \le g$ is reflexive and transitive; and that
$f < g$ signifies: $f \ne g$ but $f \le g$.

To these elementary rules we add the existence of a null-element 0 and 

we impose the main requirement, namely 


DEDEKIND'S LAW. If f, g, h are three elements in D, and if $f \le g$, then
$f + gh = (f + h)g$.


This is a partial substitute for the distributive law; and we are going to 
derive from it some further useful formulas. 
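
Dedekind's law can be tested by brute force on a small example. The sketch below is an illustration under stated assumptions: the group $Z_2 \times Z_4$, the helper names and the representation of subgroups as frozen sets are choices made here, not taken from the paper. It enumerates all subgroups, takes the sum as the generated subgroup and the cross-cut as the intersection, and checks the law for every triple with $f \le g$. It also shows that the full distributive law fails in the same lattice, which is why Dedekind's law is only a partial substitute.

```python
from itertools import product

# Hypothetical example group: Z_2 x Z_4 under componentwise addition.
MODS = (2, 4)
ZERO = (0, 0)
ELEMENTS = list(product(range(MODS[0]), range(MODS[1])))

def add(x, y):
    return tuple((a + b) % m for a, b, m in zip(x, y, MODS))

def generated(gens):
    """Subgroup generated by `gens`: closure of {0} under adding generators."""
    sub = {ZERO}
    frontier = [ZERO]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = add(x, g)
            if y not in sub:
                sub.add(y)
                frontier.append(y)
    return frozenset(sub)

# Every subgroup of this group can be generated by at most two elements.
SUBGROUPS = {generated((g, h)) for g in ELEMENTS for h in ELEMENTS}

def join(f, g):
    """The sum f + g: smallest subgroup containing both."""
    return generated(tuple(f | g))

def meet(f, g):
    """The cross-cut fg: greatest subgroup contained in both."""
    return f & g

def dedekind_law_holds():
    # f + gh = (f + h)g whenever f is a part of g.
    return all(
        join(f, meet(g, h)) == meet(join(f, h), g)
        for f in SUBGROUPS for g in SUBGROUPS for h in SUBGROUPS
        if f <= g
    )

def distributive_law_holds():
    # f(g + h) = fg + fh for all f, g, h (fails in this lattice).
    return all(
        meet(f, join(g, h)) == join(meet(f, g), meet(f, h))
        for f in SUBGROUPS for g in SUBGROUPS for h in SUBGROUPS
    )
```

The three subgroups of order 2 give a counterexample to distributivity: each one lies in the sum of the other two but meets each of them in 0.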


Lemma I.1.1. If the four elements a, b, c, d in the Dedekind set D satisfy 
(a+b) (c+d) =0, then 


(a+c)(b+d) =ab+cd =(a+d)(b+c). 


Proof. It suffices to prove the first of these equations; and this may be 
done as follows: 


(3) In this section we collect a number of elementary facts from the theory of partially
ordered sets in the form best suited to our purposes (see Footnote 2). 




$(a+c)(b+d) = (a+c)(b+d)(a+b+c) = (a+c)(b+d(a+b+c))$
$\quad = (a+c)(b+d(c+d)(a+b+c)) = (a+c)(b+d(c+(d+c)(a+b)))$
$\quad = (a+c)(b+dc) = dc+b(a+c) = dc+b(a+b)(a+c)$
$\quad = dc+b(a+c(a+b)) = dc+ab.$


Lemma I.1.2. If a, b, c are three elements in D such that $ab = (a+b)c = 0$,
then $a(b+c) = b(c+a) = 0$.


For $a(b+c) = a(a+b)(b+c) = a(b+c(a+b)) = ab = 0$.
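
Lemmas I.1.1 and I.1.2 can be checked exhaustively on a small Dedekind set. The sketch below is only an illustration: the choice of the elementary abelian group of order 8, the helper names, and the set representation of subgroups are assumptions made here. It runs through all quadruples (respectively triples) of subgroups and tests the conclusion whenever the hypothesis holds.

```python
from functools import lru_cache
from itertools import product

# Hypothetical example group: Z_2 x Z_2 x Z_2 under componentwise addition.
MODS = (2, 2, 2)
ZERO = (0, 0, 0)
ELEMENTS = list(product(*(range(m) for m in MODS)))

def add(x, y):
    return tuple((a + b) % m for a, b, m in zip(x, y, MODS))

def generated(gens):
    """Subgroup generated by `gens`: closure of {0} under adding generators."""
    sub = {ZERO}
    frontier = [ZERO]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = add(x, g)
            if y not in sub:
                sub.add(y)
                frontier.append(y)
    return frozenset(sub)

# Three generators suffice for every subgroup of this group.
SUBGROUPS = list({generated(t) for t in product(ELEMENTS, repeat=3)})
NULL = frozenset({ZERO})

@lru_cache(maxsize=None)
def join(f, g):
    return generated(tuple(f | g))

def meet(f, g):
    return f & g

def lemma_1_1_holds():
    # (a+b)(c+d) = 0 implies (a+c)(b+d) = ab + cd = (a+d)(b+c).
    for a, b, c, d in product(SUBGROUPS, repeat=4):
        if meet(join(a, b), join(c, d)) != NULL:
            continue
        middle = join(meet(a, b), meet(c, d))
        if meet(join(a, c), join(b, d)) != middle:
            return False
        if meet(join(a, d), join(b, c)) != middle:
            return False
    return True

def lemma_1_2_holds():
    # ab = (a+b)c = 0 implies a(b+c) = b(c+a) = 0.
    for a, b, c in product(SUBGROUPS, repeat=3):
        if meet(a, b) == NULL and meet(join(a, b), c) == NULL:
            if meet(a, join(b, c)) != NULL or meet(b, join(c, a)) != NULL:
                return False
    return True
```

Caching `join` keeps the quadruple loop cheap, since only 16 subgroups occur.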

The elements $x_1, \cdots, x_n$ are said to be independent, if $x_i \sum_{j \ne i} x_j = 0$ for
$i = 1, \cdots, n$; and it is readily inferred(4) from Lemma I.1.1 that the elements
$x_1, \cdots, x_n$ are independent if, and only if,

If the elements $x_1, \cdots, x_n$ are independent, then $\sum_{i=1}^{n} x_i = s$ is the direct
sum of the $x_i$ and every summand $x_i$ is a direct summand of $s$. The following
statements are easily verified. If $x$ is the direct sum of the $x_i$, and if $x_i$ is
the direct sum of the elements $x_{ij}$, then $x$ is the direct sum of the $x_{ij}$. If $s$
is the direct sum of $t$ and $u$, and if $v$ is an element between $u$ and $s$, then $v$ is
the direct sum of $u$ and $vt$; so that $a$ is a direct summand of $b$, if $a$ is a direct
summand of $c$ and if $a \le b \le c$.

If $u \le v$, then the set $v/u$ of all the elements $x$ in $D$ which satisfy $u \le x \le v$
is a Dedekind set (exactly as $D$) and $u$ is the null-element of $v/u$. If in particular
$u = 0$, then $v/0$ is the set of all the parts of $v$; and it will be possible to
write $v$ instead of $v/0$ without causing confusion.

Any biunivoque and monotonically increasing correspondence, mapping
the elements of the Dedekind set D upon the elements of D' is termed an
isomorphism or a projectivity(5) of D upon D'. If in particular u and v are
two elements in D, then an isomorphism of $(u+v)/u$ upon $v/(uv)$ is defined
by mapping $x$ in $(u+v)/u$ upon $xv$ in $v/(uv)$.

2. Cycles and their orders. If u and v are elements in a partially ordered
set (in particular in a Dedekind set), then u is said to be a cycle modulo v,
if v<u, if there exists only a finite number of elements between v and u,
and if of two elements between u and v one is always part of the other, that
is, if u/v is a finite ordered set. If u is a cycle modulo v, then we denote by
n(u/v) the number of elements between u and v which are different from v
and term this number the order of the cycle u/v. Instead of n(u/0) we write
n(u) and we say that u is a cycle instead of saying a cycle modulo 0. If the


(4) Cf. K. Menger, Annals of Mathematics, (2), vol. 37 (1936), pp. 456-482.

(5) As long as only partially ordered sets are discussed, we prefer the term “isomorphism”
as the more appropriate one. But as soon as we have to connect the “isomorphisms” of partially
ordered sets (of subgroups of a group) with the “isomorphisms” of groups, we shall use the
term “projectivity” in order to avoid confusion.




cycle z is part of the element e, then we express this fact by saying that z is 
a subcycle of e. 
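
For a concrete instance of a cycle and its order, one may take a finite cyclic group, whose subgroups correspond one-to-one to the divisors of its order, with inclusion corresponding to divisibility. The sketch below is an illustration only; the function names are assumptions made here. It reports the order n(u) when the subgroups form a chain, which happens exactly for groups of prime-power order, and `None` otherwise.

```python
def divisors(n):
    """Divisors of n; each one stands for the unique subgroup of that order in Z_n."""
    return [d for d in range(1, n + 1) if n % d == 0]

def is_chain(ds):
    """True if of any two divisors one always divides the other."""
    return all(a % b == 0 or b % a == 0 for a in ds for b in ds)

def cycle_order(n):
    """Order n(u) of the cyclic group Z_n regarded as a cycle, or None if the
    parts of Z_n do not form a finite ordered set."""
    ds = divisors(n)
    if not is_chain(ds):
        return None
    # Count the parts different from the null-element (the subgroup of order 1).
    return len(ds) - 1
```

For example, `cycle_order(8)` counts the subgroups of orders 2, 4 and 8, while `cycle_order(12)` returns `None` because the subgroups of orders 3 and 4 are not comparable.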


THEOREM I.2.1. If z is a subcycle of a sum of cycles whose orders do not
exceed m, then the order of z does not exceed m either.


Proof. It is a well known fact(6) that every part of a sum of a finite number
of cycles of order 1 (of points in projective geometry) is itself a sum of a
finite number of cycles of order 1; and from this fact one readily infers our 
theorem in the special case m = 1. 

We now proceed by induction with regard to m, assuming the theorem
to be true for m and proving it for m+1. Let $s$ be the sum of the cycles $z_i$
of order not exceeding m+1; and suppose that $z$ is a subcycle of $s$. If
$n(z_i) = m+1$, then denote by $y_i$ the uniquely determined subcycle of order m
of $z_i$; and if $n(z_i) \le m$, then put $z_i = y_i$. If $t$ is the sum of the $y_i$, then it follows
from the induction hypothesis that the subcycle $tz$ of $t$ has an order not exceeding
m. We note furthermore that $s/t$ is the sum of the cycles $(t+z_i)/t$
whose orders do not exceed 1, since they are isomorphic to $z_i/(z_i t)$ and since
$y_i \le z_i t$. Thus it follows from the special case m = 1 that the subcycle $(z+t)/t$
of $s/t$ is of an order not exceeding 1 so that $n(z/(zt)) \le 1$. Since we already
pointed out that $n(zt) \le m$, we find now that $n(z) = n(zt) + n(z/(zt)) \le m+1$,
as was to be shown.


THEOREM I.2.2. If $z$ is a subcycle of the direct sum $s$ of the cycles $z_i$, if $n$
is a positive integer, if $y_i$ is the uniquely determined subcycle of $z_i$ whose order
is exactly the minimum of the numbers $n$ and $n(z_i)$, and if $y$ is the sum of the
cycles $y_i$, then $n(z) \le n$ is a necessary and sufficient condition for $z \le y$.


REMARK. It is easy to give examples showing that the independence of
the cycles $z_i$ is indispensable for the validity of this theorem.

Proof. It is a consequence of Theorem I.2.1 that the orders of the subcycles
of $y$ do not exceed $n$, since $n(y_i) \le n$. Thus assume conversely that
n(z) <n. If k is the number of cycles z;, then we put s(é) = 085 
and ¢(4) =s(4) so that in particular =s and =y. Since z is a sub- 
cycle of #(0) we are going to prove by complete induction with regard to 7 
that z is a subcycle of each ¢(i). Thus we assume that z S/(i) and we have to 
prove that zS#(i+1). From the induction hypothesis it follows that 
$(4) Ss(4) +2 St(t) = s(t) +2441. Since (s(t) +2) /s(t) and 2/(zs(t)) are isomor- 
phic cycles, and since the order of z does not exceed n, it follows that 
n((s(4) +2) /s(t)) Sn. Since s is the direct sum of the cycles z,;, it follows that 
0 so that 2:4; and ¢(i)/s(i) are isomorphic cycles. Consequently 
s(4) +2 S s(t) +yi41 +1); and this completes the proof. 

We note without proof the important fact that the maximum and the 
minimum conditions are satisfied by the parts of a sum of a finite number of 


(6) Cf. Menger, loc. cit.


288 REINHOLD BAER [September 


cycles; it is an obvious consequence of the fact(”) that the maximum and 
minimum conditions are satisfied by the parts of a+), if they are satisfied by 
the parts of a and of b. 

3. Direct decompositions. The part v of the element w (in the Dedekind
set D) is said to be closed in(8) w, if to every cycle z ≤ w such that zv ≠ 0 there
exists a cycle of order n(z) between zv and v.


THEOREM I.3.1. Every direct summand of w is closed in w.


Proof. If w is the direct sum of u and v, if z is a subcycle of w such that
vz ≠ 0, then c = v(u+z) is between zv and v. Thus cu = 0 and zu = 0 are
consequences of the fact that the only subcycle of order 1 of the cycle
z is in v. Hence c = c/(cu) is isomorphic to (c+u)/u = (v(u+z)+u)/u
= ((v+u)(u+z))/u = (w(u+z))/u = (u+z)/u as follows from Dedekind's
law; and (u+z)/u being isomorphic to z/(uz) = z, it follows that c and z are
isomorphic so that they are cycles of equal order.


THEOREM I.3.2. If z is a subcycle of maximum order of the direct sum s
of the cycles c(1), ..., c(k), then there exists an i such that s is the direct sum
of z and of the cycles c(j) for j ≠ i.


Proof. We may assume that n(c(i)) = n(z) if, and only if, 1 ≤ i ≤ h; and we
note that 0 < h by Theorem I.2.1. If v(i) is the sum of the c(j) for j ≠ i, and if
v is the sum of the c(j) for h < j, then the product of v(1), ..., v(h) is equal
to v. Since v is a direct summand of s, v is closed in s by Theorem I.3.1; and
since the maximum order of the subcycles of v is smaller than n(z), by
Theorem I.2.1, vz = 0. Thus there exists at least one i between 1 and h such
that the subcycle z* of order 1 of z is not contained in v(i); since thus
zv(i) = 0, and since s is the direct sum of v(i) and of the cycle c(i) of order
n(z), it follows finally that s is the direct sum of z and of v(i).


COROLLARY I.3.3. Every part of w is a direct sum of cycles if, and only if,
the following conditions are satisfied.

(i) Every part of w is a sum of cycles.

(ii) If z is a subcycle of maximum order of the part v of w, then z is a direct
summand of v.


The necessity is an immediate consequence of Theorem I.3.2; the sufficiency
may be proved inductively, since every part v ≠ 0 of w is the direct
sum of a subcycle of maximum order (in v) and of some smaller element, and
since (i) implies the minimum condition for the parts of w.


COROLLARY I.3.4. If the element s is both the direct sum of the cycles c(i)
and the direct sum of the cycles d(j), then the number of cycles c(i) of order n is
the same as the number of cycles d(j) of order n.


(7) Cf. Birkhoff, loc. cit.

(8) This concept has been introduced by H. Prüfer into the theory of primary abelian
groups (under the name “Servanzuntergruppe”); cf. H. Prüfer, Mathematische Zeitschrift,
vol. 17 (1923), pp. 35-61.


For if d(1) is a cycle of maximum order in s, then it follows from Theorem
I.3.2 that s is the direct sum of d(1) and of the c(i) for i ≠ k. Thus d(1)
and c(k) are of the same order and the sum of the d(i) for 1 < i and the sum
of the c(j) for j ≠ k are isomorphic;
and now the statement may be proved by induction.
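For finite abelian p-groups the invariance asserted in Corollary I.3.4 can be made concrete: the number of summands of each order is recoverable from data that do not depend on the decomposition, namely the numbers of elements annihilated by p^k. The following sketch is an illustration under that assumption, not the author's method; killed_rank and summand_counts are hypothetical names, and the group is given as a list of exponents e_i standing for Z_{p^e_1} + ... + Z_{p^e_r}.

```python
from itertools import product


def killed_rank(p, exps, k):
    """log_p of the number of elements x of the group with p^k * x = 0."""
    moduli = [p ** e for e in exps]
    count = sum(
        1
        for x in product(*(range(m) for m in moduli))
        if all((p ** k * xi) % m == 0 for xi, m in zip(x, moduli))
    )
    r = 0
    while p ** r < count:  # count is always a power of p
        r += 1
    return r


def summand_counts(p, exps):
    """Number of cyclic summands of each order, computed from killed_rank only."""
    top = max(exps)
    r = [killed_rank(p, exps, k) for k in range(top + 2)]
    # r[k] - r[k-1] is the number of summands of order at least k.
    at_least = {k: r[k] - r[k - 1] for k in range(1, top + 2)}
    counts = {k: at_least[k] - at_least[k + 1] for k in range(1, top + 1)}
    return {k: c for k, c in counts.items() if c > 0}
```

Because killed_rank depends only on the group and not on the chosen direct decomposition, any two decompositions into cycles must yield the same counts.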

The element w splits, if every part of w is a direct sum of (a finite number
of) cycles, and if every closed part of any element v ≤ w is a direct summand
of v. The following characterization of splitting elements will be needed in
the proof of the main theorem of this section.


THEOREM I.3.5. The element w splits if, and only if, it satisfies the following 
conditions. 

(i) Every part of w is a sum of cycles.

(ii) If t < s ≤ w, if t is a subcycle of maximum order of s, and if s/t is a
cycle, then s contains a cycle of order 1 which is not part of t; and if p is a
subcycle of order 1 not part of t, then s is the direct sum of t and of a cycle
containing p.


Proof. If w splits, and if s and t satisfy the hypotheses of (ii), then t is
closed in s, therefore a direct summand of s so that s is the direct sum of t
and of some cycle z. Thus there exist subcycles of order 1 of s which are not
in t. If p is some such cycle of order 1, then denote by c a cycle of greatest
order between p and v. Then c is closed in v and therefore a direct summand
of v. Since n(z) ≤ n(t), it follows now from Corollary I.3.4 that n(c) = n(z);
and tc = 0 implies now that v is the direct sum of c and t. Thus splitting
elements satisfy (i) and (ii).

For the sufficiency proof it will be convenient to say that the part r of s 
is weakly closed in s, if subcycles of order 1 of r which are contained in sub- 
cycles of order m of s are contained in subcycles of the same order of r. 

Suppose now that w satisfies (i) and (ii), that r < v ≤ w and that r is
weakly closed in v. Since r and v are sums of cycles, there exists a subcycle x
of smallest order of v which is not part of r. If x* is the subcycle of order 1 of
x, then x* ≤ r would imply the existence of a cycle y of order n(x) between x*
and r. It follows from Theorem I.2.1 that n(y) is the maximum order of the
subcycles of x+y; and thus it follows from (ii) that y is a direct summand of
x+y which proves the existence of a cycle of order smaller than n(x) which
is part of v but not of r. This contradiction shows that xr = 0. Thus there
exists a subcycle z of v such that zr = 0 and such that the order of z is as big
as possible. Since z ≠ 0, this implies r < r+z. Suppose now that p is a subcycle
of order 1 of r+z, that b is a cycle between p and v. If p ≤ r, then there exists
a cycle of order n(b) between p and r. If the inequality p ≤ r does not hold,
then br = 0 so that n(b) ≤ n(z). Thus we may assume that pz = 0 in order to
prove that r+z is weakly closed in v. Then q = r(p+z) is a cycle of order 1,
since p and (p+z)/z = (q+z)/z are isomorphic. Since qz = 0, it follows from
(ii) that b+z is the direct sum of z and of a cycle d containing q. Since bz = 0,
n(b) = n(d). Since q ≤ r, and since r is weakly closed in v, there exists a cycle e
of order n(d) between q and r. Since p ≤ q+z ≤ e+z, there exists by (ii) a
cycle f containing p such that e+z is the direct sum of f and z. It follows from
Corollary I.3.4 that n(f) = n(d) = n(b); and thus we have finally shown that
r+z is weakly closed in v. Since by (i) the maximum condition is satisfied by
the parts of w, it follows now by induction that v is the direct sum of r and
of some cycles. Thus we have shown that (i), (ii) imply the splitting of w
and imply that every weakly closed part is a direct summand (and therefore
closed).

If the element v is part of the element u, then u splits modulo v, whenever
u splits in the Dedekind set u/v (which consists of all the elements between v
and u and whose null is v). The element w splits completely, if every part u
of w splits modulo each of its parts v. We mention the important and well
known fact that both finite abelian groups(9) and finite-dimensional projective
geometries(10) split completely.


THEOREM I.3.6. The element w splits completely if, and only if, it satisfies
the following conditions.

(i) Every part of w is a sum of cycles.

(ii) If r ≤ s ≤ w, and if s/r contains at most one subcycle of order 1, then
s/r is a cycle.


Proof. The necessity of these conditions is an immediate consequence of 
the fact that subcycles of maximum order are closed, and that s/r is a sum 
of cycles, if s is a sum of cycles. 

Before proving the sufficiency of (i) and (ii) we prove the following help- 
ful lemma. 


(I.3.6.1) If s is a sum of cycles, if t < s and if s/t is a cycle then there exists a
cycle z such that s = t+z.


For there exists between t and s one and only one element r such that
s/r is a cycle of order 1. Since r < s, not every subcycle of s is part of r. If z is
a subcycle of s, though not of r, then the inequality t+z ≤ r does not hold
so that t+z = s, since s/t is a cycle.

Suppose now that (i) and (ii) are satisfied by w, that t < s ≤ w, that t is a
subcycle of maximum order of s and that s/t is a cycle. Since s is therefore
not a cycle, it follows from (ii) that s contains at least two different subcycles
of order 1, one of which is certainly not part of t. Suppose now that p is a
subcycle of order 1 of s and that the inequality p ≤ t does not hold or pt = 0.
Denote by b a cycle of greatest order between p and s. If g is an element
between b and s such that g/b is a cycle of order 1, then g is no cycle so that g
contains by (ii) a subcycle q of order 1 different from p. Since it follows from
(I.3.6.1) that s is the sum of t and of some other cycle, it follows that the
sum s* of all the subcycles of order 1 of s is the sum of any two of its (different)
subcycles (of order 1) so that s* = p+q or g = b+s*. Thus s/b contains
one and only one subcycle of order 1; and (ii) implies consequently that s/b
is a cycle. Hence it follows from (I.3.6.1) that s is the sum of b and of some
cycle so that n(s/b) ≤ n(t). Since tb = 0, it follows now that s is the direct sum
of t and of the cycle b containing p. Thus we showed that the conditions of
Theorem I.3.5 are satisfied by w, if w satisfies (i) and (ii); that is, if w satisfies
our conditions (i) and (ii), then w splits. But if w satisfies conditions (i) and
(ii), and if u ≤ v ≤ w, then v/u satisfies these conditions so that v/u splits too.
Thus w splits completely, if it satisfies the conditions (i) and (ii).


(9) Cf. Prüfer, loc. cit.

(10) The central importance of this fact for projective geometry has been stressed by
Menger, loc. cit., and by G. Birkhoff, Annals of Mathematics, (2), vol. 36 (1935), pp. 743-748.

The elements u and v are termed relatively prime, if u and v are not both 0,
if uv = 0, and if x ≤ u+v implies x = xu+xv. The decomposition of a finite
abelian group into its primary components is an example of a decomposition
into a sum of relatively prime elements. The elements u and v are said to be
relatively prime modulo their common part t, if they are relatively prime
elements in the Dedekind set (u+v)/t. Finally we say that the element w
is primary, if there does not exist any triplet of elements u, v, t such that
t ≤ uv ≤ u+v ≤ w and such that u and v are relatively prime modulo t. The
system of subgroups of an abelian group of order a power of a prime furnishes
an example of a primary system; and projective geometries whose lines carry
at least three points are primary too.
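The definition of relatively prime elements can be tested mechanically in small subgroup lattices. The sketch below is illustrative only: it assumes the Dedekind set is the lattice of all subgroups of Z_{m_1} x ... x Z_{m_k}, represents a subgroup as a frozenset of tuples, and enumerates subgroups through pairs of generators (adequate for the groups of rank at most two used here). The names generate, all_subgroups, join and relatively_prime are hypothetical.

```python
from itertools import product


def generate(gens, moduli):
    """Subgroup of Z_{m1} x ... x Z_{mk} generated by gens, as a frozenset."""
    zero = tuple(0 for _ in moduli)
    elems, frontier = {zero}, [zero]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = tuple((a + b) % m for a, b, m in zip(x, g, moduli))
            if y not in elems:
                elems.add(y)
                frontier.append(y)
    return frozenset(elems)


def all_subgroups(moduli):
    """All subgroups generated by at most two elements (assumption: rank <= 2)."""
    elems = list(product(*(range(m) for m in moduli)))
    return {generate((a, b), moduli) for a in elems for b in elems}


def join(u, v, moduli):
    """The sum u+v of two subgroups."""
    return generate(tuple(u | v), moduli)


def relatively_prime(u, v, moduli):
    """u, v not both 0, uv = 0, and x <= u+v implies x = xu + xv."""
    zero = frozenset({tuple(0 for _ in moduli)})
    if u == zero and v == zero:
        return False
    if u & v != zero:
        return False
    s = join(u, v, moduli)
    return all(x == join(x & u, x & v, moduli)
               for x in all_subgroups(moduli) if x <= s)
```

In Z_12 the primary components generated by 3 and by 4 pass this test, whereas the two coordinate subgroups of Z_2 x Z_2 fail it, because the diagonal subgroup meets both of them in 0.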

If the sum of the two cycles p and q of order 1 contains just these two
cycles and no further cycles (of order 1), then p and q are relatively prime.
If u and v are relatively prime, and if u and v are sums of cycles, then u
contains a cycle p of order 1, v contains a cycle q of order 1, and p+q contains
just these two cycles and no further ones. Combining these remarks with
Theorem I.3.6 we obtain the following fundamental theorem.


THEOREM I.3.7. The element w splits completely and is primary if, and only
if, it satisfies the following conditions.

(i) Every part of w is a sum of cycles.

(ii) If r ≤ s ≤ w, and if s/r contains at most two different subcycles of order 1,
then s/r is a cycle.


An n-dimensional projective geometry whose lines carry at least three 
points possesses systems of +1 points no m of which are on a hyperplane. 
To generalize this property which will be of importance in the future we say
that the elements v(i) form a partial sum of the elements u(i), if v(i) ≤ u(i)
for i = 1, ..., m. If we have v(i) < u(i) for at least one i, then the v(i) form a
proper partial sum of the u(i). If the u(i) are independent, and if the v(i) form
a partial sum of the u(i), then they form a proper partial sum if, and only if,
the sum of the v(i) is smaller than the sum of the u(i).

LEMMA I.3.8. If the primary element s is the direct sum of the cycles c(i),
if the parts of s are sums of cycles, then there exists a subcycle of s which is not
contained in any proper partial sum of the c(i); if the subcycle z of s is not
contained in any proper partial sum of the c(i), then z is a subcycle of maximum
order of s and s is the direct sum of z and of the c(i) for i ≠ k, provided c(k) is of
maximum order too.


Proof. If we denote by c(i)' the uniquely determined subcycle of c(i) such
that c(i)/c(i)' is a cycle of order 1 and by s' the sum of the c(i)', then s/s'
is an (m-1)-dimensional projective geometry (a direct sum of the cycles
c(i)/c(i)' of order 1). Hence there exists in s/s' a cycle c of order 1 which is
not part of any proper partial sum of the c(i)/c(i)'. But c may be seen to be
the sum of s' and of a cycle z which meets the requirements. The second
statement is obvious.

4. Partition into relatively prime, primary components. The elements
u_1, ..., u_n constitute a partition of their sum s, if s is the direct sum of the
u_i, and if u_j and the sum of the u_i for i ≠ j are relatively prime for every j.
Note that the primary components of a finite abelian group constitute a
partition of this group. If w is a completely splitting element in a Dedekind
set, then w is the direct sum of cycles and therefore of primary elements; but
there exist examples of completely splitting elements which do not admit of a
partition into relatively prime, primary elements.


THEOREM I.4.1. (a) The element w in the Dedekind set D admits of at most 
one partition into (relatively prime) primary elements. (b) The completely split- 
ting element w admits of a partition into (relatively prime) primary elements if, 
and only if, any two subcycles with non-primary sum are relatively prime. 


Proof. Assume first that the elements u_i as well as the elements v_j
constitute a partition of the element w. Then the elements u_i v_j ≠ 0 for
j = 1, ..., h constitute a partition of u_i so that the elements u_i v_j ≠ 0
constitute a partition of w. Statement (a) is now a consequence of the fact that
primary elements do not admit of partitions into (more than one) relatively
prime element.

Suppose now that the primary elements p_1, ..., p_k constitute a partition
of the completely splitting element w. If z is a subcycle not 0 of w, z* the
uniquely determined subcycle of order 1 of z, then z is the sum of the zp_i
and z* is the sum of the z*p_i for i = 1, ..., k.
Consequently there exists a subscript j such that z*p_j = z*, z*p_i = 0 for i ≠ j;
and this implies zp_i = 0 for i ≠ j so that every subcycle of w is contained
in one and only one component p_i.

If u and v are two subcycles of w, then they are either contained in the
same component p_i (in which case their sum is primary) or else they are in
different components and then they are relatively prime, proving the necessity
of the condition of (b).

Suppose now that the condition of (b) be satisfied by the subcycles of
the completely splitting element w. If the part t of w is not primary, then
there exist elements x, y such that x < y ≤ t and such that y/x consists of
exactly four elements, namely x, y and two cycles of order 1. Since w and
therefore t splits completely, this implies the existence of two subcycles u,
v not 0 of t such that uv = 0 and such that u+v is not primary. Hence it
follows from the hypothesis that u and v are relatively prime. Suppose
now that s < t, that t/s is a cycle of order 1, and that s is primary. If we
denote by c' the uniquely determined subcycle of order n(c)-1 of the cycle
c ≠ 0, then c' ≤ s for every subcycle c ≠ 0 of t so that in particular u'+v' ≤ s.
But the inequality u+v ≤ s does not hold since u+v is not primary, though
s is primary. Thus not both cycles u and v are subcycles of s. Since
t/s = (s+u+v)/s is a cycle of order 1 and is isomorphic to (u+v)/s(u+v)
= (u+v)/(su+sv) (as u and v are relatively prime), it follows now that at least
one of the cycles u and v is part of s. Thus we assume that u ≤ s, and that
the inequality v ≤ s does not hold. Since u+v' ≤ s, u+v' is primary; and
since u and v are relatively prime, this implies v' = 0 so that v is a cycle of
order 1. If now c is a subcycle of order 1 of s which is different from the
subcycle u* of order 1 of u, then u*+c is the direct sum of two cycles of order 1
and is primary as a part of s. Hence there exists a subcycle z of order 1 of
u*+c which is different from c and u* so that c+u* = u*+z = z+c. Then
c+v = (c+v)/z(c+v) is isomorphic to (z+c+v)/z = (z+u*+v)/z and this is
isomorphic to (u*+v)/z(u*+v) = u*+v so that c and v are relatively prime
too. If q is any subcycle not 0 of s, q* its subcycle of order 1, then q* and v are
relatively prime; hence q+v is not primary and it follows from the hypothesis
that q and v are relatively prime. Since every part of s is a direct sum of
cycles, it follows now that s and v are relatively prime. Thus s and the cycle
v of order 1 constitute a partition of t = s+v so that every subcycle of t is
either part of s or of v; and this shows in particular that v is the only
subcycle r of t such that t = s+r.

Since the maximum condition is satisfied by the parts of w, there exists
some greatest primary part p of w. Certainly p ≠ 0, if w ≠ 0, since every
subcycle of w is primary. If p = w, then w is primary so that we may assume that
0 < p < w. If z is any subcycle of w, then either z ≤ p or p+z is not primary.
In the latter case (p+z)/p is a cycle different from 0 and there exists a
uniquely determined subcycle c of z such that (p+c)/p is a cycle of order 1.
From what we have shown in the preceding paragraph of the proof it follows
that c is a cycle of order 1 such that p and c are relatively prime. If x is any
subcycle of p, then x and c are relatively prime; since x+c ≤ x+z, it follows
that x+z is not primary; and hence it follows from the hypothesis that x
and z are relatively prime; and this shows that p and z are relatively prime.


Thus every subcycle of w which is not part of p is relatively prime to p.
Denote now by q the sum of all the subcycles of w which are not part of p.
Then q = z_1 + ... + z_k where each of the z_i is relatively prime to p. If c is a
subcycle of order 1 of p, then c and z_i are relatively prime. Since cz_1 = 0, we
may assume that cq_{i-1} = 0 for q_{i-1} = z_1 + ... + z_{i-1}. If c(q_{i-1}+z_i) ≠ 0, then
c ≤ q_{i-1}+z_i. Hence c+z_i = q_{i-1}(c+z_i)+z_i = z_i, since c and z_i are relatively
prime, and since therefore q_{i-1}(c+z_i) = q_{i-1}c + q_{i-1}z_i = q_{i-1}z_i by the
induction hypothesis; but this would imply c ≤ z_i, an impossibility which proves
cq = 0. This implies zq = 0 for every subcycle z of p; and hence every
subcycle not 0 of w is part of one and only one of the two elements p and q. Since
the parts of w are sums of cycles, this implies that p and q are relatively
prime; and from the validity of maximum and minimum condition for the
parts of w one now readily infers that the greatest primary parts of w
constitute a partition of w into relatively prime, primary elements.

5. Sums and products of infinite sets. It is well known that a great part 
of the theory of finite abelian groups, in particular the basis theorem, holds 
true for abelian groups the orders of whose elements are bounded. Cross- 
cuts of any number of subgroups and sums of any number of subgroups are 
subgroups too; and we propose to give in this section a short analysis of the 
pertinent concepts. 

If S is a set of elements in the Dedekind set D, then the element p in D is a
product of S, if p is a greatest element contained in all the elements in S; and 
the element s in D is a sum of S, if s is a smallest element containing every 
element in S. It is readily verified that there exists at most one product and 
at most one sum of S. 
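The definitions of product and sum of a set S use only the ordering, and the uniqueness remark can be observed directly in a small example. The sketch below assumes, purely for illustration, that D is the lattice of subgroups of Z_n, each subgroup being named by the divisor d that generates it, so that the subgroup of d_1 contains the subgroup of d_2 exactly when d_1 divides d_2. The names divisors, contains, product_of and sum_of are hypothetical.

```python
def divisors(n):
    """Divisors of n; the divisor d stands for the subgroup of Z_n generated by d."""
    return [d for d in range(1, n + 1) if n % d == 0]


def contains(a, b):
    """True if the subgroup generated by a contains the one generated by b."""
    return b % a == 0


def product_of(S, n):
    """Greatest elements contained in every member of S (a list of length <= 1)."""
    lower = [d for d in divisors(n) if all(contains(s, d) for s in S)]
    return [d for d in lower if all(contains(d, e) for e in lower)]


def sum_of(S, n):
    """Smallest elements containing every member of S (a list of length <= 1)."""
    upper = [d for d in divisors(n) if all(contains(d, s) for s in S)]
    return [d for d in upper if all(contains(e, d) for e in upper)]
```

Both functions return every element meeting the definition, so a result of length one reflects the uniqueness noted above; in Z_12, for instance, the product of the subgroups generated by 2 and 3 is the subgroup generated by 6.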

The set S of elements not 0 in D is independent, if there exists for every 
element ¢ in S the sum S(t) of the elements different from ¢ in S, and if 
tS(t) = 0 for every t in S.


THEOREM I.5.1(11). Suppose that the element w satisfies the following
conditions.

(i) If the nonvacuous set T of subcycles of w contains every subcycle of any 
finite sum of cycles in T, then there exists one and only one part s(T) of w such 
that T is the set of all the subcycles of s(T). 

(ii) The orders of the subcycles of w are bounded. 

Then every part of w is the direct sum of each of its closed parts and of a 
finite or infinite number of cycles if (and only if) every finite sum of subcycles of w 
splits. 

REMARK. Condition (i) is satisfied in every abelian group without ele- 
ments of infinite order and is satisfied in the primary abelian operator groups 
too (see below).—That condition (ii) is indispensable for the validity of the 
theorem is well known. 


(11) H. Prüfer proved this theorem for primary abelian groups.




Proof. We note first that on account of condition (i) sums and products of
any number of parts of w exist, that furthermore the subcycle c of w is part
of the sum of the set S of parts of w if (and only if) there exists a finite number
of elements in S whose sum contains c, and that finally the set S of parts not
0 of w is independent if, and only if, every finite subset of S is independent.
We recall furthermore that, as in the proof of Theorem I.3.5, the element
u is termed weakly closed in the element v, if u ≤ v, and if there exists
a cycle of order n between p and u whenever p is a subcycle of order 1 of u
which is contained in a subcycle of order n of v.

Suppose now that u < v ≤ w and that u is weakly closed in v. Then there
exists a subcycle of v which is not part of u and one verifies, exactly as in the
proof of Theorem I.3.5, that there exist subcycles not 0 of v which are
independent of u. Hence it follows from condition (ii) that there exists a subcycle
z of v such that zu = 0 and such that the order n(z) is as big as possible. To
show that the (direct) sum u+z is weakly closed we need consider only
subcycles p of order 1 of u+z which satisfy: pu = pz = 0. Suppose that b is some
cycle between p and v. Then n(b) ≤ n(z) and bz = 0. That q = u(p+z) is a cycle
of order 1 is verified as in the proof of Theorem I.3.5. Thus it follows from
Theorem I.3.5 and the fact that the direct sum b+z of the two cycles b and z
splits, that there exists a cycle of order n(b) between q and b+z ≤ v. Since
q ≤ u, and since u is weakly closed in v, there exists a cycle d of order n(b)
between q and u. Since qz = 0, and since the direct sum d+z of the two cycles d
and z splits by hypothesis, it follows from Theorem I.3.5 that there exists a
cycle of order n(d) = n(b) between p and d+z ≤ u+z; and thus it has been
shown that u+z is weakly closed in v too.

If the part r of the element v ≤ w is weakly closed in v, then denote by R
the set of all the elements s between r and v such that s is weakly closed in v 
and such that s is the direct sum of r and of a finite or infinite number of 
cycles. If x and y are two elements in R, then x is said to be better than y, 
if x is the direct sum of y and of a finite or infinite number of cycles. From 
the remarks in the first paragraph of this proof it may be inferred that there 
exists a best element in R; and it is an immediate consequence of the results 
of the second paragraph of the proof that v itself is the only best element in R 
so that v is the direct sum of u and of a finite or infinite number of cycles. 

If the element w satisfies condition (i) of Theorem I.5.1, then it follows
from Theorem I.2.1 that there exists one and only one part w_n of w such that
the set of subcycles of w_n is just the set of subcycles of w with an order not
exceeding n. Every finite sum of subcycles of w is contained in almost every
w_n; and Theorem I.5.1 may be applied to the w_n.


Part II. THE ANALYTIC THEORY 


An abelian operator group may be termed a primary abelian operator
group, if the system of its admissible subgroups meets the requirements
imposed in the first part. Not only do abelian groups of prime power order
and the abelian operator groups underlying projective geometry belong to
this class, but it is even possible to develop a theory of primary abelian oper- 
ator groups which is fully comparable to both projective geometry and the 
theory of finite abelian groups. For example, the duality between group and 
character group may be proved, a fact that specializes, in the case of pro- 
jective geometry, to the duality between the point space and the hyperplane 
space and which thus contains the theory of systems of linear equations. 
Further examples are extensions of the fundamental theorem of projectivity 
and of the theorem of Pappus. 

For our purposes it does not suffice to characterize the primary abelian 
operator groups as operator groups with specific properties. We have to solve 
the problem of determining those sets of subgroups of an abelian group 
which are the systems of all the admissible subgroups of a primary abelian 
operator group. A set L of subgroups of the abelian group G may be proved 
to be the system of all the admissible subgroups of G for a suitable set of 
operators, if L satisfies the following conditions: (a) L contains sums and cross- 
cuts of its subsets. (b) If the subgroup Z in L is the smallest subgroup in L 
containing a given element z, then the subgroups in L that are part of Z form 
a finite ordered set. (c) If a subgroup Z in L contains just m subgroups in L 
and if the subgroups in L that are part of Z form an ordered set, then there 
exist at least three independent subgroups of this kind in L. Under the same
hypotheses we may prove that every projectivity of L is induced by an
isomorphism of the underlying group G. It seems noteworthy that both these 
theorems are obtained as special cases from one and the same construction. 

1. Construction of endomorphisms and isomorphisms. The composition
of the elements in commutative groups G will be written as addition: x+y.
A linear transformation(12) of G into the commutative group H is a function f
which maps every element g in G upon a uniquely determined element g^f in H
such that (g+h)^f = g^f + h^f. Linear transformations of G into G are termed
endomorphisms(13) of G; and these shall be written usually as multipliers, that
is, the endomorphism f of G maps the element g in G upon the element gf
in G.

If E is a set of endomorphisms of G, S is a subset of G, then SE is the set
of all the elements se for s in S, e in E; and the subset S of G is E-admissible,
if SE ≤ S. The system of all the E-admissible subgroups of G shall be denoted
by D(G; E). It is one of the objects of this section to determine all the
systems D(G; E) meeting certain requirements.
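A very small instance may help to fix the notation D(G; E). The sketch below is an illustration under stated assumptions, not a construction from the text: G is taken to be Z_2 x Z_2, E consists of the single endomorphism that interchanges the two coordinates, and the names ELEMS, add, subgroups, admissible and swap are hypothetical.

```python
from itertools import combinations, product

ELEMS = list(product(range(2), repeat=2))  # the four elements of G = Z_2 x Z_2
ZERO = (0, 0)


def add(x, y):
    """Addition in G."""
    return ((x[0] + y[0]) % 2, (x[1] + y[1]) % 2)


def subgroups():
    """All subgroups of G: subsets containing 0 and closed under addition."""
    others = [x for x in ELEMS if x != ZERO]
    found = []
    for r in range(len(others) + 1):
        for extra in combinations(others, r):
            s = frozenset((ZERO,) + extra)
            if all(add(x, y) in s for x in s for y in s):
                found.append(s)
    return found


def admissible(endos):
    """D(G; E): the subgroups S with SE contained in S."""
    return [s for s in subgroups() if all(e(x) in s for e in endos for x in s)]


def swap(x):
    """The endomorphism interchanging the two coordinates."""
    return (x[1], x[0])
```

Of the five subgroups of G, only 0, the diagonal subgroup and G itself are admissible for swap, so D(G; E) is here a chain of three elements.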

If L is a set of subgroups of G, then the endomorphism e of G is L-admissible,
if Se ≤ S for every subgroup S in L; and the set K(G; L) of all the
L-admissible endomorphisms of G is a ring, provided addition and multiplication
are defined as customary. If in particular L = D(G; E) for some system
E of endomorphisms of G, then E ≤ K(G; L) and L = D(G; K(G; L)), though
in general it may happen that L < D(G; K).


(12) Often termed “homomorphism.”
(13) Often called “auto-homomorphism,” “(proper or improper) automorphism,” and so on.

The system L of subgroups of G shall be termed a ring of subgroups, if it
contains 0, G and the cross-cuts and the sums (=join-groups) of each of its
subsets. It is well known that rings of subgroups are Dedekind sets; and their 
importance for us lies in the fact that the sets D(G; E) are rings of sub- 
groups. 

If L is a ring of subgroups of the commutative group G, and if X is a sub- 
set of G, then the cross-cut of all the subgroups in L which contain X is a sub- 
group in L; and we call this subgroup the L-subgroup XL of G or the L-sub- 
group generated by X. If Z is an L-subgroup of G such that the L-subgroups 
contained in Z form a cycle (in the meaning of §1.2), then Z is called a cycle 
in L; if the L-subgroup Z is a cycle different from 0 in L, then Z contains ele- 
ments which are not contained in any proper L-subgroup of Z. If z is such an 
element in Z, then Z=zL so that we may say that cycles are cyclic. Since 
in general not every cyclic subgroup is a cycle, we define: The ring L of sub- 
groups of G is primary, if every cyclic subgroup gL in L is a cycle in L. 

If S and T<S are L-subgroups of G, then the L-subgroups X between 
T and S define the L-subgroups X/T of S/T. If L is a primary ring of sub- 
groups of G, then the L-subgroups of S/T form a primary ring of subgroups 
too. 

If L is a primary ring of subgroups of G, Z a cycle in L, then either Z =0 
or Z contains a uniquely determined subcycle Z* of order 1 (in L), a notation 
which we shall use occasionally. 

If g is an element in G, L a primary ring of subgroups of G, then gL is a
cycle; and thus we may define the L-order of g by the equation n(g) = n(gL),
the order of the cycle gL in L. It is an immediate consequence of Theorem
I.2.1 that the order of the element g in G does not exceed n, if g is the sum of
elements in G whose orders do not exceed n.

If x and y are two elements of L-order 1, then either xL = yL or else the
cross-cut of xL and yL is 0. In the latter case x+y is not contained in xL and
not in yL so that xL+yL contains at least three subgroups of order 1. From
this remark one infers readily that G is primary in the Dedekind set L (using
the definition of §I.3), if L is a primary ring of subgroups of G(14).

We note finally that the elements x, y, ... are termed (L-) independent,
if the subgroups xL, yL, ... are independent elements of the Dedekind set
L (that is, if the cross-cut of xL and of the sum yL + ... is 0 and so on).

The following general theorem contains as special cases both the existence
of the coordinates and the existence of the semi-linear transformations of a
projective geometry.


(14) Whether or not a ring of subgroups of an abelian group satisfying the conditions of
Theorem I.3.7 is a primary ring as defined in this section seems to be an open problem.


THEOREM II.1.1. If L is a primary ring of subgroups of the abelian group G,
if J is a primary ring of subgroups of the abelian group H, if L either does not
contain any cycle of order n or at least three independent ones, if p is a
projectivity of L upon J, if g is an element in G and h an element in (gL)^p, then there
exists a linear transformation q of G into H such that

(i) g^q = h,

(ii) x^q is for every x in G an element in (xL)^p,

(iii) n(x) - n(x^q) = n(g) - n(h) for n(g) - n(h) ≤ n(x) and x^q = 0 for
n(x) ≤ n(g) - n(h).


Proof(15). We note first that p maps cycles in L upon cycles in J and that p
preserves both order of cycles and independence of subgroups. In particular
we have 0 = 0^p and H = G^p. If X is any subset of G, then it will prove
convenient to put p(X) = (XL)^p.


(II.1.1.1) If x and y are two independent elements in G such that 0 < n(x)
≤ n(y), then x and y-x are independent, n(y-x) = n(y) and the order of the
cross-cut of yL and (y-x)L is n(y) - n(x).


From the hypothesis it follows that xL+yL is the direct sum of xL and
yL; and it is obvious that it equals xL+(y-x)L = yL+(y-x)L. It is a
consequence of Theorem I.2.1 that the order of y-x does not exceed n(y). Since
yL and (xL+yL)/(xL) are isomorphic, it follows that (y-x)L is, modulo its
cross-cut with xL, a cycle of order n(y); and this shows that n(y-x) = n(y)
and that y-x and x are independent. Since xL and (xL+yL)/(yL) are
isomorphic, (y-x)L is, modulo its cross-cut with yL, a cycle of order n(x); and
the order of the cross-cut of yL and (y-x)L is therefore n(y-x) - n(x) =
n(y) - n(x).
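Statement (II.1.1.1) can be verified numerically in a small case. The sketch below assumes, for illustration only, that G = Z_2 x Z_8 and that L is the ring of all subgroups of G, so that gL is the cyclic subgroup generated by g and n(g) is the exponent k with g of order 2^k; the names MODULI, P, cyclic, n_order and diff are hypothetical.

```python
MODULI = (2, 8)  # G = Z_2 x Z_8 (assumption: L is the ring of all subgroups of G)
P = 2


def cyclic(g):
    """The subgroup gL: all multiples of g."""
    out, x = set(), (0, 0)
    while True:
        out.add(x)
        x = tuple((a + b) % m for a, b, m in zip(x, g, MODULI))
        if x == (0, 0):
            return frozenset(out)


def n_order(subgroup):
    """Order n of a cyclic p-subgroup: log_p of its number of elements."""
    size, n = len(subgroup), 0
    while P ** n < size:
        n += 1
    return n


def diff(y, x):
    """The element y - x of G."""
    return tuple((a - b) % m for a, b, m in zip(y, x, MODULI))


x, y = (1, 0), (0, 1)              # independent, n(x) = 1 <= n(y) = 3
d = diff(y, x)                     # the element y - x
cross_cut = cyclic(y) & cyclic(d)  # cross-cut of yL and (y-x)L
```

Here n(y-x) = n(y) = 3, the subgroups xL and (y-x)L meet in 0, and the cross-cut of yL and (y-x)L has order 2 = n(y) - n(x), in agreement with (II.1.1.1).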

(II.1.1.2) If x and y are two independent elements in G such that 0 < n(x)
≤ n(y), and if t is an element in p(y), then there exists one and only one element
f(x, t; y) in p(x) such that f(x, t; y) ≡ t modulo p(x-y). (Note that this
statement holds trivially true for x = 0 in which case f(x, t; y) = 0.)


The cross-cut of p(x) and p(x-y) is 0, since the cross-cut of xL and 
(x-y)L is 0 by (II.1.1.1); and consequently there exists at most one solution 
of our congruence, since the difference of any two solutions would be an ele- 
ment in the cross-cut of p(x) and p(x-y). Since xL+yL = xL+(x-y)L, 
it follows that p(x)+p(y) = p(x)+p(x-y); and since t is an element in 
p(x)+p(y), and J a ring of subgroups, it follows that t = r+s for r in p(x), s in 
p(x-y) or r ≡ t mod p(x-y) so that r = f(x, t; y) is the required solution of 
our congruence. 
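
The congruence of (II.1.1.2) can be solved by brute force in a toy case. The sketch assumes, purely for illustration, that G = H = Z/4 + Z/8, that L = J is the ring of all subgroups, and that p is the identity projectivity, so that p(x) is the cyclic subgroup generated by x. The function names are hypothetical.

```python
MODS = (4, 8)   # assumption: G = H = Z/4 + Z/8 and p is the identity map

def sub(a, b):
    return tuple((u - v) % m for u, v, m in zip(a, b, MODS))

def mul(k, a):
    return tuple((k * u) % m for u, m in zip(a, MODS))

def cyc(x):
    """Cyclic subgroup generated by x; stands in for p(x) = (xL)^p."""
    return {mul(k, x) for k in range(8)}   # 8 is the exponent of G

def f(x, t, y):
    """Solve r = t mod p(x - y) for r in p(x); exactly one solution is expected."""
    modulus = cyc(sub(x, y))
    solutions = [r for r in cyc(x) if sub(r, t) in modulus]
    assert len(solutions) == 1, "uniqueness claimed by (II.1.1.2) failed"
    return solutions[0]

x, y = (1, 0), (0, 1)            # independent, n(x) = 2 <= n(y) = 3
values = {k: f(x, mul(k, y), y) for k in range(8)}
```

In this toy case the solution for t = ky is kx, and it vanishes exactly when n(x) ≤ n(y) - n(t), in line with (II.1.1.3).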


(15) The method used in this proof is an adaptation and extension of a method employed 
by us previously in proving a similar theorem; cf. R. Baer, American Journal of Mathematics, 
vol. 61 (1938), pp. 1-44, in particular Footnote 10. 




1942] PROJECTIVE SPACES AND ABELIAN GROUPS 299 


(II.1.1.3) If x and y are independent elements in G such that n(x) ≤ n(y), and 
if t is an element in p(y), then f(x, t; y) = 0 for n(x) ≤ n(y) - n(t) and n(f(x, t; y)) 
= n(x) - (n(y) - n(t)) for n(y) - n(t) ≤ n(x). 


The order of the cross-cut of p(y) and p(y-x) is n(y) - n(x), since this is, 
by (II.1.1.1), the order of the cross-cut of yL and (y-x)L. Thus t is an 
element in this cross-cut and therefore in p(y-x), if n(t) ≤ n(y) - n(x). But 
if t is in p(y-x), then f(x, t; y) = 0 by (II.1.1.2). It follows from the defini- 
tion of f(x, t; y) that tJ+p(x-y) = f(x, t; y)J+p(x-y) = d. Hence f(x, t; y) 
= 0 implies that t is contained in p(x-y) and that therefore n(t) ≤ 
n(y) - n(x). Thus f(x, t; y) ≠ 0, if n(y) - n(x) < n(t). Since xL and (x-y)L 
are independent, so are p(x) and p(x-y); and since f(x, t; y) is an element 
not 0 in p(x), f(x, t; y) and p(x-y) are independent, so that n(f(x, t; y)) = 
n(d/p(x-y)). But d/p(x-y) is isomorphic to tJ modulo its cross-cut with 
p(x-y); and this cross-cut equals the cross-cut of p(y) and p(x-y), since 
t is in p(y), but not in p(x-y). Thus it follows finally from (II.1.1.1) that 
n(f(x, t; y)) = n(t) - (n(y) - n(x)), as was to be shown. 


(II.1.1.4) If x, y, z are independent elements in G such that n(x) ≤ n(y) ≤ n(z), 
then (z-x)L is the cross-cut of zL+xL and (z-y)L+(y-x)L. 


Clearly (z-x)L is contained in the cross-cut C of zL+xL and 
(z-y)L+(y-x)L. It is a consequence of (II.1.1.1) that zL+xL is the direct 
sum of xL and (z-x)L; and (z-x)L < C is therefore equivalent to dependence 
of xL and C. But this is impossible, since the cross-cut of xL and (y-x)L 
+(z-y)L is 0 as a consequence of (II.1.1.1) and our hypotheses; and thus 
C = (x-z)L. 


(II.1.1.5) If x, y, z are three independent elements in G such that n(x) ≤ n(y) 
≤ n(z), and if t is an element in p(z), then 


f(x, t; z) = f(x, f(y, t; z); y). 


Since f(x, f(y, t; z); y) - t = f(x, f(y, t; z); y) - f(y, t; z) + f(y, t; z) - t is an 
element in the cross-cut of p(xL+zL) and p((x-y)L+(y-z)L), and since 
this cross-cut is p(x-z) by (II.1.1.4), it follows that the element f(x, f(y, t; z); y) 
is contained in p(x) and satisfies f(x, f(y, t; z); y) ≡ t mod p(x-z). But it 
follows from (II.1.1.2) that f(x, t; z) is the only element meeting these re- 
quirements. 


(II.1.1.6) If x, y, z are elements in G such that xL+yL and zL are inde- 
pendent and such that n(x) ≤ n(y) ≤ n(z), and if t is in p(z), then 


f(x + y, t; z) = f(x, t; z) + f(y, t; z). 


The proof of this statement will be effected in three steps. 
A. x and y are independent. 




300 REINHOLD BAER [September 


In this case the three elements x, y, z are independent. Then it follows 
from (II.1.1.1) that y and z-y are independent, that n(z) = n(z-y) so that 
(z-y)L and xL+yL are independent too. Likewise y-x and x are inde- 
pendent, n(y) = n(y-x) so that the three elements x, y-x, z-y are inde- 
pendent and satisfy n(x) ≤ n(y-x) ≤ n(z-y). Hence it follows from (II.1.1.4) 
that the cross-cut of xL+(z-y)L and (z-x)L+yL = (x-y-x)L+(z-y 
-(x-y))L is just (z-y-x)L. Thus, if we put v = f(x, t; z)+f(y, t; z), the 
element v-t is contained in p(x+y-z), since it is contained in the cross-cut of 
p(xL+(y-z)L) and p(yL+(x-z)L). But v is contained in the 
cross-cut p(x+y) of p(xL+yL) and p((x+y-z)L+zL) = p((x+y)L+zL). 
Thus v has been shown to be an element in p(x+y) such that v ≡ t mod 
p(x+y-z); and this proves the required identity, since f(x+y, t; z) is by 
(II.1.1.2) the only element satisfying these conditions. 

B. x = -y. 

We may assume x ≠ 0, since f(0, t; z) = 0. Then xL+zL is the direct sum 
of xL = yL and zL. Since there exist at least three independent elements of 
order n(z) in G, we may infer from Corollary I.3.4 that there exists an element 
v of order n(z) in G such that x = -y, v, z are three independent elements. 
It is a consequence of (II.1.1.1) that y and v+x are independent elements 
too, n(v) = n(v+x). Hence it follows from A that 


f(v, t; z) = f(v + x + y, t; z) = f(v + x, t; z) + f(y, t; z) 
= f(v, t; z) + f(x, t; z) + f(y, t; z), 
0 = f(x, t; z) + f(y, t; z). 


C. x and y are not independent. 

We may assume that neither x nor y is 0, since otherwise nothing need be 
proved. Thus xL and yL are dependent cycles different from 0 so that they 
have the same subcycle of order 1: c* = (xL)* = (yL)*. Since both z, x and z, y 
are pairs of independent elements, it follows that (zL)*+c* contains every 
subcycle of order 1 of xL+zL and of yL+zL. 

If the three elements x+y, y, z are independent, then x+y, -y, z are in- 
dependent too; and it follows from A and B that 

f(x, t; z) = f(x + y - y, t; z) = f(x + y, t; z) + f(-y, t; z) 
= f(x + y, t; z) - f(y, t; z). 
Thus we need only handle the case where x+y ≠ 0 and x+y and y are de- 
pendent so that as before ((x+y)L)* = c*. As under B there exists an element 
v in G such that n(v) = n(z) and such that v and c*+zL are independent. Then 
the triplets of elements x, v, z and x+y, v, z are triplets of independent ele- 
ments whose orders do not exceed n(z). Since c* is the only subcycle of order 
1 contained in both xL+vL and yL+zL, and since x and v+x are inde- 
pendent, it follows that y, v+x, z is another triplet of independent elements 



whose orders do not exceed n(z). Thus it follows from A that f(v, t; z) + f(x+y, 
t; z) = f(x+y+v, t; z) = f(x+v, t; z) + f(y, t; z) = f(v, t; z) + f(x, t; z) + f(y, t; z), 
completing the proof of the identity (II.1.1.6). 


(II.1.1.7) If x and z are two independent elements such that n(x) ≤ n(z), and 
if s is an element in p(x), then there exists an element t in p(z) such 
that s = f(x, t; z). 


Since p(xL+zL) = p((x-z)L+zL) = p(x-z)+p(z), and since s is an ele- 
ment in p(xL+zL), there exist elements r, t in p(x-z) and p(z), respectively, 
such that s = r+t. Then s is an element in p(x) such that s ≡ t mod p(x-z); 
and since t is in p(z), it follows from (II.1.1.2) that s = f(x, t; z). 

We denote by G_n the set of all those elements in G whose L-order does not 
exceed n. It has been pointed out at the beginning of this section that G_n is a 
subgroup in L. The set H_n of all the elements in H whose J-order does not 
exceed n is likewise a J-subgroup of H. 


(II.1.1.8) If z is an element of order n in G, and if t is in p(z), then there 
exists one and only one linear transformation f of G_n into H_n such that z^f = t, 
x^f is in p(x) for every x in G_n, n(x^f) - n(x) = n(t) - n for n - n(t) < n(x), but 
x^f = 0 for n(x) ≤ n - n(t). 


There exist by the hypotheses of Theorem II.1.1 two elements x, y of order 
n such that x, y, z are three independent elements. We put r = f(x, t; z) and 
s = f(y, t; z). Since t is an element in p(z) such that t ≡ r mod p(z-x), it follows 
from n(x) = n and (II.1.1.2) that t = f(z, r; x) and likewise that t = f(z, s; y). 
Applying (II.1.1.5) we find that f(x, t; z) = f(x, f(y, t; z); y) = f(x, s; y) and like- 
wise f(y, t; z) = f(y, r; x). It is a consequence of (II.1.1.3) that n(r) = n(s) = n(t). 

Any element v ≠ 0 in G_n belongs to one and only one of the following three 
classes. 

Class 1. v is independent of each of the three subgroups xL+yL, yL+zL 
and zL+xL. 

Class 2. v is independent of two of the three subgroups xL+yL, yL+zL, 
zL+xL and depends on the third one; and v is independent of each of the 
three subgroups xL, yL, zL. 

Class 3. v is independent of one and only one of the three subgroups 
xL+yL, yL+zL, zL+xL; and v is dependent on one and only one of the 
subgroups xL, yL, zL. (Thus dependence of v and z would imply independence 
of vL and xL+yL.) We have to prove that these three classes exhaust all the 
possibilities. If v is independent of xL+yL, then v is independent of both x 
and y. If v is independent of two of the three subgroups xL+yL, yL+zL 
and zL+xL, then v is therefore independent of each of the elements x, y, z. 
If v depends on both xL+yL and yL+zL, then (vL)* is contained in the cross- 
cut yL of these two subgroups so that v and y are dependent. But then v is 
certainly independent of xL+zL. 




It follows from (II.1.1.2) and this trichotomy that of the three functions 
f(u, r; x), f(u, s; y) and f(u, t; z) at least two are defined for u = v; and it fol- 
lows from (II.1.1.5) and the properties of r, s, t that those of these functions 
which are defined for u = v have the same value v^f for u = v. If we put 0^f = 0, 
then the function f is defined for all the elements in G_n; and it follows from 
(II.1.1.2) and (II.1.1.3) that v^f is a uniquely determined element in p(v) for 
every v in G_n; and that v^f = 0 for n(v) ≤ n - n(t), n(v^f) - n(v) = n(t) - n for 
n - n(t) < n(v). 

If v and w are any two elements (not both 0) in G_n, then at least one of the 
elements x, y, z is by Corollary I.3.4 independent of the subgroup vL+wL. 
If, for example, x and vL+wL are independent, then x and each of the three 
elements v, v+w, w are independent so that f(v, r; x), f(v+w, r; x) and f(w, r; x) 
are well determined elements; and since the orders of these elements do not 
exceed the order n of x, it follows from (II.1.1.6) that (v+w)^f = f(v+w, r; x) 
= f(v, r; x) + f(w, r; x) = v^f + w^f; and thus f is a linear transformation meeting 
all the requirements. 

If g is any linear transformation which meets the requirements 
of (II.1.1.8), and if v and w are two independent elements in G_n, n = n(w), 
then v^g is an element in p(v) such that v^g ≡ w^g mod p(v-w), since v^g - w^g 
= (v-w)^g is an element in p(v-w). If x, y, z are the three elements used in 
the construction of f, then x^g = f(x, z^g; z) = f(x, t; z) = r, y^g = s, z^g = t, by 
(II.1.1.2). If v is any element in G_n, then v is 0 or independent of at least one 
of the elements x, y, z; and if v and x are independent, then it follows from 
(II.1.1.2) that v^g = f(v, r; x) = v^f so that f = g. 

During the remainder of the proof it will be convenient to term a linear 
transformation f of G_n into H_n permissible, if x^f is for every element x in G_n an 
element in p(x), if S^f is a J-subgroup of H_n for every L-subgroup S of G_n, 
and if there exists an integer m ≥ 0 such that x^f = 0 for n(x) ≤ m and 
n(x) - n(x^f) = m for m < n(x). 


(II.1.1.9) Every permissible linear transformation of G_n into H_n is induced 
by a permissible linear transformation of G_{n+1} into H_{n+1}. 


If, as we may assume without loss in generality, G_n < G_{n+1}, then there 
exist at least three independent elements of order n+1 in G. Let x, y be any 
pair of independent elements of order n+1 in G, let z be an element of order n 
in xL and let f be a permissible linear transformation of G_n into H_n. Since z 
and y are independent, (II.1.1.7) implies the existence of an element t in p(y) 
such that z^f = f(z, t; y). There exists by (II.1.1.8) one and only one permissible 
linear transformation g of G_{n+1} into H_{n+1} such that y^g = t (put m = n+1-n(t)); 
and as shown in the last paragraph of the proof of (II.1.1.8) we have 
z^g = f(z, t; y) = z^f. Since g induces a permissible linear transformation of G_n 
into H_n, it follows from (II.1.1.8) that g and f coincide on G_n, as was to be 
shown. 




Our theorem is now an immediate consequence of (II.1.1.8) and (II.1.1.9), 
provided the orders of the elements in G are bounded, that is, G = G_j for some 
integer j. If the orders of the elements in G are not bounded, then our theo- 
rem is again an immediate consequence of (II.1.1.8) and (II.1.1.9), if one 
remembers that every element in G is contained in almost every G_n. 


THEOREM II.1.2. If L is a primary ring of subgroups of the abelian group G 
such that L contains either no subcycle of order n or at least three independent 
ones, then L is the set D(G; E) of all the E-admissible subgroups of G where E 
is the ring K(G; L) of all the L-admissible endomorphisms of G. 


Proof. If x is any element in yL for y an element in G, then there exists by 
Theorem II.1.1 an endomorphism f of G such that y^f = x and such that g^f 
is in gL for every g in G. Clearly f belongs to E; and thus we have shown that 
dL = dE for every d in G, a fact that immediately implies L = D(G; E). 


THEOREM II.1.3. If L is a primary ring of subgroups of the abelian group G 
such that L contains either no cycles of order n or at least three independent 
ones(16), and if J is a primary ring of subgroups of the abelian group H, then 
every projectivity of L upon J is induced by an isomorphism of G upon the whole 
group H. 


Proof. If p is a projectivity of L upon J, and if g is an element not 0 in G, 
then gL and (gL)^p are cycles of equal order and there exists therefore an 
element h such that (gL)^p = hJ. Since n(g) = n(h), there exists by Theorem 
II.1.1 a linear transformation q of G into H with the following properties: 
g^q = h, x^q is for every element x in G an element in (xL)^p such that n(x) = n(x^q). 
Since therefore x^q = 0 implies n(x) = 0, that is, x = 0, we see that q is an isomor- 
phism such that S^q ≤ S^p for every L-subgroup S of G. If u is any element not 0 
in S^p, then uJ is a cycle in J and there exists one and only one subcycle Z 
of S such that Z^p = uJ (and clearly n(Z) = n(u)). There exists in G an element y 
such that y and Z are independent and such that n(y) = n(Z) = n(u) = n. Since 
yL and Z are independent cycles of order n, so are y^q J and uJ; and y^q - u and 
u are independent elements of order n too. There exists a cycle T of order n 
in L such that T^p = (y^q - u)J and one verifies that yL+Z is the direct sum of T 
and Z. Hence there exist uniquely determined elements t and z in T and Z, 
respectively, such that y = t+z. Since t^q is an element in (y^q - u)J and z^q an ele- 
ment in uJ, z^q - u = y^q - u - t^q is an element in the cross-cut of uJ and 
(y^q - u)J. But this cross-cut is 0, since the cross-cut of T and Z is 0; and hence 
we have shown that u = z^q, S^q = S^p; and this completes the proof of the theo- 
rem. 


(16) That this condition is indispensable for the validity of the theorem may be seen from 
simple examples like the groups of order a prime number (though here the theorem would 
hold true at least for auto-projectivities), or the direct sum of two abelian groups of order p 
not 2 or 3 (where the theorem would not hold true for auto-projectivities); cf. R. Baer, loc. cit., 
p. 31. 
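
The second example of this footnote can be checked by counting. The sketch below assumes that L is the ring of all subgroups of G = Z/p + Z/p; then an auto-projectivity amounts to an arbitrary permutation of the p+1 subgroups of order p, so there are (p+1)! of them, and the code counts how many of these are induced by automorphisms of G. The function names are hypothetical.

```python
from itertools import product
from math import factorial

def lines(p):
    """The p + 1 subgroups of order p in Z/p + Z/p, each as a frozenset."""
    out = set()
    for v in product(range(p), repeat=2):
        if v != (0, 0):
            out.add(frozenset(((k * v[0]) % p, (k * v[1]) % p) for k in range(p)))
    return sorted(out, key=sorted)

def induced_permutations(p):
    """Distinct permutations of the lines induced by automorphisms of the group."""
    ls = lines(p)
    index = {l: i for i, l in enumerate(ls)}
    perms = set()
    for a, b, c, d in product(range(p), repeat=4):
        if (a * d - b * c) % p == 0:
            continue  # singular matrix, hence not an automorphism
        perm = tuple(
            index[frozenset(((a * x + b * y) % p, (c * x + d * y) % p) for x, y in l)]
            for l in ls
        )
        perms.add(perm)
    return perms

def compare(p):
    """(number of induced permutations, number of all permutations of the lines)."""
    return len(induced_permutations(p)), factorial(p + 1)
```

Under these assumptions compare(2) and compare(3) give equal counts, whereas compare(5) gives 120 against 720, so most permutations of the six subgroups of order 5 are not induced by any automorphism.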




2. The ideals of the ring of endomorphisms. If L is a ring of subgroups 
of the abelian group G, then the endomorphism e of G is said to be L-admissi- 
ble, if Se ≤ S for every subgroup S in L; and the set E = K(G; L) of all the 
L-admissible endomorphisms of G is a ring. If on the other hand a ring E of 
endomorphisms of the abelian group G has been given, then a subgroup S of G 
is termed E-admissible, if Se ≤ S for every e in E; and the set L = D(G; E) of 
all the E-admissible subgroups of G is a ring of subgroups. If this ring D(G; E) 
of subgroups of G is a primary ring of subgroups (as defined in §II.1), then we 
say for short that G is primary over E; and it is the object of this section to 
characterize the rings E with primary D(G; E) by inner properties(17); and 
to analyze the relations between the ideals in E and the subgroups (in particu- 
lar: the cycles) in D(G; E). With this in mind we define: the ring E is pri- 
mary if 

(i) E contains a universal unit 1; 

(ii) every right-ideal in E is two-sided; 

(iii) the two-sided ideals not 0 in E form a (finite or infinite) descending 
chain (of the order type of part of the negative integers). 

Such a primary ring E contains one and only one greatest two-sided ideal 
different from E which we shall denote by P = P(E). We derive first some sim- 
ple properties of E and P. 

(iv) An element in E possesses an inverse in E if, and only if, it is not con- 
tained in P. 

Proof. If the element z in E is not in P, then the right-ideal zE is not part 
of P so that zE = E by (ii). There exists therefore an element z' in E such that 
zz' = 1. Since z' cannot be in P, there exists likewise an element z'' such that 
z'z'' = 1; and thus we have z = zz'z'' = z'' or z'z = 1. That elements in P 
do not possess inverses, is obvious. 

(v) Every two-sided ideal not 0 in E is a power of P and a principal right- 
ideal; and 0 is the cross-cut of all the P^i. 

Proof. If Q is a two-sided ideal different from 0 in E, then there exists 
one and only one greatest two-sided ideal Q' which is a proper part of Q. 
There exists an element in Q which is not contained in Q'; and if q is any such 

(17) G. Köthe, Mathematische Zeitschrift, vol. 39 (1934), pp. 31-44; T. Nakayama, Bulle- 
tin of the American Mathematical Society, vol. 44 (1938), pp. 719-723; K. Asano, Japanese 
Journal of Mathematics, vol. 15 (1939), pp. 231-253; vol. 16 (1939), pp. 1-36 have treated the 
following problem: Given an abstract ring R, to find the necessary and sufficient conditions 
such that every abelian group admitting the elements in R as operators and satisfying the maxi- 
mum and minimum condition for admissible subgroups is the direct sum of cyclic admissible 
subgroups and such that every admissible subgroup of a cyclic group over R is itself cyclic. 
Then it is possible to derive necessary conditions in a trivial fashion by the remark that R is 
an abelian group over R. Clearly our problem is quite different, as we consider a pair: abelian 
group G, ring E of endomorphisms of G; and ask for conditions on G, E assuring a satisfactory 
theory for the group G over E. Moreover we have to exclude completely those pairs G, E where 
G is cyclic over E whereas we may omit the maximum and minimum condition for admissible 
subgroups. 




element, then qE is a right-ideal contained in Q, but not in Q' so that Q = qE. 
Thus QP = qEP = qP. If p is an element in P, then qp is an element in Q. If qp 
would not be an element in Q', then Q = qpE so that there would exist an ele- 
ment r in E, satisfying q = qpr. Since p is in P, 1-pr is not in P and possesses 
therefore by (iv) an inverse t so that q = q(1-pr)t = 0, an impossibility which 
proves that QP ≤ Q'. If x is an element in Q', then x is in Q = qE so that there 
exists an element y satisfying x = qy. If y would not be in P, then there would 
exist, by (iv), the inverse z of y so that q = qyz = xz would be an element in 
Q', an impossibility which proves that Q' ≤ QP. Thus we have shown that 
QP = Q' of which equality (v) is an immediate consequence. 

From (iv) one infers that P = 0 if, and only if, E is a (not necessarily com- 
mutative) field; and in general E/P is a field. Thus it follows from (v) that 
either P = 0 or P is the one and only one prime ideal in E. Any element p such 
that P = pE shall be called a prime in E; and if p is a prime in E, then every 
two-sided ideal different from 0 in E has the form P^i = p^i E. 

If the number of two-sided ideals in E is finite, then there exists a smallest 
positive integer m = m(E) such that P^m = 0; and if the number of two-sided 
ideals in E is infinite, then we put m(E) = ∞. In the first case we infer from 
(iv) that an element in E is a zero-divisor if, and only if, it is contained in P; 
and in the second case it may be shown that none of the elements in E is a 
zero-divisor. 
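
A familiar commutative instance may help to fix ideas. The sketch below assumes E = Z/p^m with p = 3 and m = 3 (an example chosen here, not taken from the text) and checks by enumeration that the ideals form the chain E > P > P^2 > P^3 = 0, that the units are exactly the elements outside P, and that the zero-divisors are exactly the elements of P. The function names are hypothetical.

```python
def principal_ideals(n):
    """Ideals of Z/n; every ideal of this ring is principal."""
    return {frozenset((a * k) % n for k in range(n)) for a in range(n)}

def powers_of_P(p, m):
    """P^0 = E, P^1 = P, ..., P^m = 0 for E = Z/p^m and P = pE."""
    n = p ** m
    return [frozenset((p ** i * k) % n for k in range(n)) for i in range(m + 1)]

def units(n):
    """Elements of Z/n possessing an inverse."""
    return {a for a in range(n) if any((a * b) % n == 1 for b in range(n))}

def zero_divisors(n):
    """Elements a with ab = 0 for some b != 0 (0 itself counts when n > 1)."""
    return {a for a in range(n) if any((a * b) % n == 0 for b in range(1, n))}

p, m = 3, 3                      # assumption: E = Z/27
E = set(range(p ** m))
chain = powers_of_P(p, m)        # [E, P, P^2, P^3]
```

Here m(E) = 3 is the smallest exponent with P^m = 0, and the prime p = 3 generates P in the sense P = pE.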


THEOREM II.2.1. Suppose that the ring E of endomorphisms of the abelian 
group G is a primary ring. 

(1) If m(E) is finite, then the ring D(G; E) of E-admissible subgroups of G 
is a primary ring of subgroups and m(E) is the maximum order of the cycles 
in D(G; E). 

(2) D(G; E) is for infinite m(E) a primary ring of subgroups of G if, and 
only if, there exists to every element g in G an element e ≠ 0 in E such that ge = 0; 
and if D(G; E) is primary, then there exist cycles of every order in D(G; E). 

(3) The order of the cycle xE in D(G; E) is i if, and only if, P^i is the set of 
all the elements e in E such that xe = 0. 

(4) If Z is a cycle of order n in D(G; E), and if 0 ≤ i ≤ n, then ZP^i is the 
uniquely determined subcycle of order n-i of Z (in D(G; E)) and P^i contains 
every element e in E such that Ze ≤ ZP^i. 


Proof. If g is an element in G, N(g) the set of all the elements e in E such 
that ge = 0, then N(g) is a right-ideal in E; and hence it follows from (ii) and 
(v) that either N(g) = 0 or N(g) = P^i for suitable i < m(E). If g ≠ 0, then 
N(g) < E and therefore N(g) ≤ P. If e is an element in E such that gE = (ge)E 
(for g ≠ 0 in G), then there exists an element e' in E such that g = gee' so 
that 1-ee' is in N(g) and therefore in P; and this implies that neither e nor e' 
is in P. Applying (iv) it follows now immediately that 

(*) gE = (ge)E for g ≠ 0 if, and only if, e possesses an inverse in E. 




If S is an E-admissible subgroup of gE, then the set of all the elements e 
in E such that ge is in S is a right-ideal between N(g) and E. If J and J' are 
right-ideals such that N(g) ≤ J < J', then gJ and gJ' are E-admissible sub- 
groups such that gJ < gJ' as follows from (ii), (iv), (v) and (*). But now it is 
clear that gE is a cycle of order i in D(G; E) if, and only if, N(g) = P^i. This 
proves (3). To derive (4) we need note now only that PN(gP) = N(g). 

From the fact that gE is a cycle of order i if, and only if, N(g) = P^i, we 
infer that D(G; E) is primary if, and only if, either m(E) is finite, or m(E) is 
infinite, but there exists to every g in G an e ≠ 0 in E such that ge = 0. If the 
maximum order of the cycles in the primary ring D(G; E) of subgroups of G 
is k, and if e is an element in P^k, then it follows from (3) that ge = 0 for every g 
in G; and this implies e = 0, since e is an endomorphism of G. This completes 
the proof of (1) and (2). 
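
Parts (3) and (4) can be observed in a small commutative example. The sketch assumes E = Z/8 acting by multiplication on G = Z/8 + Z/4 (so P = 2E, m(E) = 3, and every subgroup of G is E-admissible); these choices and the helper names are illustrative assumptions, not taken from the text.

```python
from itertools import product

P, M = 2, 3                  # assumption: E = Z/8, so P = 2E and m(E) = 3
SIZE_E = P ** M
MODS = (8, 4)                # assumption: G = Z/8 + Z/4
ZERO = (0, 0)

def act(g, e):
    """The element ge, with e in E acting on G by multiplication."""
    return tuple((u * e) % m for u, m in zip(g, MODS))

def annihilator(g):
    """N(g): all e in E with ge = 0."""
    return frozenset(e for e in range(SIZE_E) if act(g, e) == ZERO)

def power_of_P(i):
    """The ideal P^i of E."""
    return frozenset((P ** i * k) % SIZE_E for k in range(SIZE_E))

def cycle_order(g):
    """Order of the cycle gE: gE has 2**order elements."""
    return len({act(g, e) for e in range(SIZE_E)}).bit_length() - 1

def check_parts_3_and_4():
    for g in product(*(range(m) for m in MODS)):
        n = cycle_order(g)
        assert annihilator(g) == power_of_P(n)              # part (3)
        for i in range(n + 1):                              # part (4)
            sub = {act(g, e) for e in power_of_P(i)}
            assert len(sub) == 2 ** (n - i)
    return True
```

For g = (0, 1), for instance, gE has order 2 and N(g) = {0, 4} = P^2.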


THEOREM II.2.2. If the ring E of endomorphisms of the abelian group G 
contains the identity element 1, if the ring D(G; E) of E-admissible subgroups of 
G is primary and contains at least two independent cycles of order m, but no 
cycles of higher order, then the ring E is a primary ring and contains every 
D(G; E)-admissible endomorphism of G. 


Proof. If the E-admissible subgroup uE of G is a cycle of the maximum 
order m, and if the E-admissible subgroup vE of G is independent of uE, 
then vE is a cycle of an order not exceeding m and the smallest E-admissible 
subgroup W of G which contains u and v is the direct sum W = uE+vE of 
these two cycles. The E-admissible subgroup (u+v)E is a cycle of an order not 
exceeding m in D(G; E) and W is the sum of the cycles vE and (u+v)E. If C 
is the cross-cut of (u+v)E and vE, then uE, W/vE and (u+v)E/C are iso- 
morphic cycles; and this implies that C = 0 and that (u+v)E is a cycle of 
order m which is independent of vE. Suppose now that e is an element in E 
such that ue = 0. Then (u+v)e = ve is an element in the cross-cut C of (u+v)E 
and vE; and thus it follows that ue = 0 implies ve = 0, if uE is a cycle of order m, 
and if uE and vE are independent. 

Assume now that uE is a cycle of order m, that e is an element in E such 
that ue = 0 and that g is some element in G. If gE is independent of uE, then 
we have already shown that ge = 0. If the cross-cut of uE and gE is different 
from 0, then there exists, by our hypothesis, an element s in G such that 
sE is a cycle of order m which is independent of uE. Since gE is a cycle, and 
since uE and gE have their uniquely determined subcycle of order 1 in com- 
mon, it follows that sE and gE are independent. But we have shown already 
that ue = 0 implies se = 0 and that se = 0 implies ge = 0. Since e is an endomor- 
phism we have therefore proved: 

If uE is a cycle of order m, and if e is an element in E such that ue = 0, 
then e = 0. 

If 0 ≤ i ≤ m, then denote by (uE)^i the uniquely determined subcycle of 




order m-i of the cycle uE of order m and by P_i the set of all the elements e 
in E such that ue is in (uE)^i. Clearly every P_i is a right-ideal in E = P_0; and 
we have just proved that P_m = 0. If J is any right-ideal in E, then uJ is an 
E-admissible subgroup of G so that uJ = (uE)^j = uP_j for some j; and since 
ue = uf for e and f in E implies e = f, it follows now that J = P_j so that every 
right-ideal in E is contained in the descending chain of the m+1 right- 
ideals P_i. If e is any element in E, then eP_i is a right-ideal in E and is hence a 
P_j too. But P_i < P_j would imply P_i < eP_i < eP_j = e^2 P_i < ··· contradicting the 
fact that there exists but a finite number of right-ideals between P_i and E. 
Thus eP_i ≤ P_i so that the P_i are two-sided ideals and this shows the primarity 
of the ring E. 

Denote now by F the ring of all the D(G; E)-admissible endomorphisms of 
G. Then D(G; E) = D(G; F); and thus it follows from what we have shown 
already that E ≤ F, that F is a primary ring and that uf = 0 implies f = 0, if 
uF = uE is a cycle of order m. If w is any element in F, then uw is in uE. 
Hence there exists an element e in E such that uw = ue; and this implies e = w, 
since uE = uF is a cycle of order m. Thus E = F; and this completes the proof. 


THEOREM II.2.3. If the ring E of endomorphisms of the abelian group G con- 
tains every D(G; E)-admissible endomorphism of G, if D(G; E) is a primary 
ring of subgroups of G which contains at least two independent cycles of every 
order n, then E is a primary ring with the following property(18). 

(vi) If P is the prime ideal of E, if e_i is an element in P^i for i = 0, 1, 2, ···, 
then there exists one and only one element e in E such that e - e_0 - e_1 - ··· - e_i 
is an element in P^{i+1} (for every i). 
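
Property (vi) is the familiar convergence of series in a complete ring. As a purely illustrative sketch, assume E behaves like the p-adic integers with P = pE (an assumption made here for the example, not a statement from the text), and represent e_i = c_i p^i by ordinary integers. The element e is then determined modulo p^N by the partial sums, as the hypothetical helpers below show.

```python
def partial_sums(coeffs, p):
    """a_i = e_0 + ... + e_i with e_i = coeffs[i] * p**i, an element of P^i."""
    sums, total = [], 0
    for i, c in enumerate(coeffs):
        total += c * p ** i
        sums.append(total)
    return sums

def limit_mod(coeffs, p):
    """The element e of (vi), known modulo p**len(coeffs)."""
    return partial_sums(coeffs, p)[-1] % p ** len(coeffs)

def satisfies_vi(e, coeffs, p):
    """Check that e - a_i lies in P^(i+1) for every available index i."""
    return all((e - a) % p ** (i + 1) == 0
               for i, a in enumerate(partial_sums(coeffs, p)))
```

With p = 3 and coefficients 2, 5, 1, 4 the partial sums are 2, 17, 26, 134 and e is 53 modulo 81; any other residue modulo 81 violates one of the congruences, which reflects the uniqueness asserted in (vi).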


Proof. We denote by G(n) the smallest E-admissible subgroup of G which 
contains all the elements x in G such that xE is a cycle of an order not exceed- 
ing n (in D(G; E)); and we denote by P(n) the set of all the elements e in E 
such that G(n)e = 0. Clearly P(n) is a two-sided ideal in E. If x is any element 
in G(n), then x = x_1 + ··· + x_k where the x_i are elements in G such that the 
E-admissible subgroup x_i E is a cycle of an order not exceeding n in D(G; E). 
Since xE is a subcycle of the sum of the cycles x_i E, it follows from Theorem 
I.2.1 that the order of the cycle xE does not exceed n either; and thus we 
have shown that the order of the cycle xE in D(G; E) does not exceed n if, 
and only if, x is in G(n). 

Since every element x in G is contained in some G(n), it follows that the 
cross-cut of the descending chain of two-sided ideals P(n) is 0. If e_i is for 
i = 0, 1, 2, ··· an element in P(i), and if a_i = e_0 + ··· + e_i, then all the endo- 
morphisms a_n, a_{n+1}, ··· induce the same endomorphism b_n in G(n). Since 
every element in G is contained in some G(n), there exists therefore one (and 
only one) endomorphism e of G which coincides with b_n on G(n) (for every n). 


(18) Because of this property (vi) the ring E may be termed a P-adic ring. 




Since every a_i is D(G; E)-admissible, so is e; and hence e is an element in E 
such that e - a_i is in P(i+1). 

Suppose that s is any endomorphism in E. If s ≠ 0, then there exists a 
greatest n such that s is in P(n) (so that s is not in P(n+1)) since 0 is the 
cross-cut of the P(i). Suppose that t is some element in P(n) (which may or 
may not be in P(n+1)). Clearly E(h) = E/P(h) is a ring of endomorphisms of 
G(h) such that D(G(h); E(h)) consists only of the E-admissible subgroups of 
G(h). Thus it follows from Theorem II.2.2 that E(h) is a primary ring of endo- 
morphisms of G(h) whose only right-ideals are by (v) the two-sided ideals 
P(i)/P(h) = (P(1)/P(h))^i for 0 ≤ i ≤ h. Thus there exists an element s_h in E 
such that t - ss_h is an element in P(n+h) (for h = 1, 2, ···). Hence s(s_h - s_{h+1}) 
is in P(n+h); and since s is in P(n), but not in P(n+1), it follows that 
(P(n+h)+s)E(n+h) = P(n)/P(n+h) and that therefore s_h - s_{h+1} = e_h is in 
P(h). Hence it follows from what has been shown in the preceding paragraph 
of this proof that there exists one and only one endomorphism e in E such 
that e - e_0 - ··· - e_i is in P(i+1). Then t - s(s_0 - e) 
= t - ss_i + s(e - e_0 - ··· - e_{i-1}) is the sum of two elements in P(n+i) so that 
t - s(s_0 - e) is part of the cross-cut of the P(n+i) and is therefore 0. Conse- 
quently t = s(s_0 - e); or P(n) = sE; and this shows that every right-ideal not 0 
in E is one of the two-sided ideals P(n). Hence E is a primary ring of endo- 
morphisms of G; and that E satisfies (vi) has been shown in the second para- 
graph of the proof. 


THEOREM II.2.4. If the ring E of endomorphisms of the abelian group G con- 
tains every D(G; E)-admissible endomorphism, if D(G; E) is a primary ring of 
subgroups of G which contains either no subcycle of order n or at least two inde- 
pendent ones, then the following condition is necessary and sufficient for every 
left-ideal in E to be two-sided. 

(vii) If U and V are E-admissible subgroups of G such that V is the sum of U 
and of a finite number of cycles in D(G; E), and if there exists at most one cycle 
of order 1 in V/U, then V/U is a cycle. 


Proof. If P = P(E) = 0, then E is a field and every cycle not 0 is of order 1 
so that we may assume in the course of the proof that P ≠ 0. Suppose first 
that every ideal in E is two-sided, and the E-admissible subgroups U and V 
meet the requirements of (vii). Then (vii) will be proved as soon as we have 
shown that there do not exist different cycles of equal order in V/U. We note 
first that the cycle (U+x)E in V/U is of order n, on account of the preceding 
theorems, if, and only if, the cross-cut of U and xE is exactly xP^n. If 
(U+x)E and (U+y)E are cycles of order n in V/U, then (U+x)P^{n-1} and 
(U+y)P^{n-1} are cycles of order 1 in V/U and hence it follows from the require- 
ments on V and U enunciated in (vii) that (U+x)P^{n-1} = (U+y)P^{n-1}. If p is 
any prime in E, then this implies the existence of an element r in E such that 
U+xp^{n-1} = U+yp^{n-1}r; and r cannot be an element in P since xp^{n-1} is not 





in U, though yp^{n-1}P is contained in U. Since every left-ideal is assumed to 
be a right-ideal, there exists an element s in E such that sp^{n-1} = p^{n-1}r; and s 
cannot be in P, since p^{n-1}r is not in P^n. Since therefore (x-ys)p^{n-1} 
= xp^{n-1} - yp^{n-1}r is an element in U, ((x-ys)E+U)/U is a cycle of an order 
not exceeding n-1. If we make now the induction hypothesis that there exists 
at most one subcycle of order n-1 of V/U, then it follows that U+(x-ys)E 
is part of U+yE so that U+xE ≤ U+yE; and consequently there exists at 
most one subcycle of order n of V/U. Thus (vii) is a consequence of the fact 
that all the ideals in E are two-sided. 

Suppose now that (vii) is satisfied by D(G; E), that r is an element in E, 
though not in P, and that p is a prime in E (so that every right-ideal different 
from 0 in E is of the form p^i E). If there exist cycles of order n in D(G; E), 
then there exist two independent cycles xE and yE of order n in D(G; E). If 
U = (xp-ypr)E and V = xE+yE, then U ≤ xP+yP so that V/U is not a 
cycle, since V/(xP+yP) is the direct sum of two cycles of order 1. Since the 
cross-cut of U and xE is null, it follows that (U+xP^{n-1})/U is a cycle of order 
1. Hence it follows from (vii) that there exists a cycle (U+vE)/U of order 1 
in V/U which is different from (U+xP^{n-1})/U. Thus v is not in U, but vp is 
in U and the cross-cut of U+xP^{n-1} and U+vE is exactly U. Furthermore 
there exist elements s, t in E such that v = xs+yt. If t would be in P, then 
t = pf for f in E and consequently (since r is not in P and (iv) may be applied) 


xs + ypf = xs + yprr^{-1}f 
= x(s + pr^{-1}f) - (xp - ypr)r^{-1}f 
≡ x(s + pr^{-1}f) mod U; 


and this is impossible, since it would imply U+vE ≤ U+xE. Thus t is not in P. 
Since vp is in U, there exists an element g in E such that xsp+ytp = vp 
= (xp-ypr)g = xpg-yprg; and from the independence of xE and yE we may 
infer xsp = xpg and ytp = yprg. Since t is not in P, we have yE = ytE and hence 
(ytp)E = yP = (yprg)E or P = prgE; and thus g cannot be in P. Hence 
xP = xpgE = xspE or P = spE so that s cannot be in P either. Thus it follows 
from xsp = xpg that sp-pg is in P^n, since n is the order of xE; and hence it 
follows from (iv) that pg^{-1} ≡ s^{-1}p mod P^n; and from ytp = yprg we obtain like- 
wise that pr ≡ tpg^{-1} ≡ ts^{-1}p mod P^n. Thus we have obtained the following in- 
termediary result. 

(*) If r is in E though not in P, if p is a prime in E, and if there exist cycles
of order n in D(G; E), then there exists an element q(n) in E such that
pr ≡ q(n)p mod P^n.

If the orders of the cycles in D(G; E) are bounded, and if m is the maximum
order of the cycles in D(G; E), then P^m=0 and (*) implies pr=q(m)p.
If the orders of the cycles in D(G; E) are not bounded, then it follows from (*)
that q(n)p ≡ q(n+1)p mod P^n so that q(n) ≡ q(n+1) mod P^{n-1} for every n;
and hence it follows from Theorem II.2.3, (vi) that there exists one and only
one element q in E such that q ≡ q(n) mod P^{n-1} or pr ≡ q(n)p ≡ qp mod P^n for
every n. Since the cross-cut of the ideals P^n is 0, it follows now that pr=qp;
and thus we have shown

(**) If r is in E though not in P, if p is a prime in E, then there exists an
element q in E such that pr=qp.

Since every right-ideal different from 0 is a power of P and of the form
p^iE, every element not 0 in E has the form p^ir for p a prime in E and r not in
P. If p^js is another element in this normal form, then there exist elements q, t
in E (by (**) and sp^iE = p^iE) such that p^js p^ir = p^j p^i trs^{-1} s = p^iq p^js so that
every left-ideal in E is a right-ideal; and this completes the proof of our theorem.

It is a consequence of this theorem and the other theorems of this section, 
that by Theorem I.3.6 finite sums of cycles in the ring D(G; E) are completely 
splitting, primary elements in a Dedekind set, if E is a primary ring all of 
whose ideals are two-sided, and if there exist either no cycles of order n in
D(G; E) or at least two independent ones. 

If the ideals in the primary ring E are two-sided, then it is easily verified
that the subsets GP^i of the group G are E-admissible subgroups (do not only
generate E-admissible subgroups); and on the basis of this remark one may 
prove by the customary arguments: 


A(19). If E is a primary ring of endomorphisms of the abelian group G such
that m(E) is finite and such that all the ideals in E are two-sided, then G is a
direct sum of cycles in D(G; E).


B(20). If the primary ring E of endomorphisms of the abelian group G contains
every D(G; E)-admissible endomorphism of G, if every ideal in E is two-sided,
and if D(G; E) is primary, then every E-admissible subgroup S different
from 0 of G satisfying S=SP is a direct summand of G and is the direct sum of
subgroups(21) in D(G; E) which contain one and only one cycle of order n for
every n.


In order to show the independence of condition (vii) from the other condi- 
tions we construct an example of a primary ring E not all of whose ideals are 
two-sided. 


Let F be a commutative field which possesses an isomorphism v upon a
proper subfield F^v < F. Consider the set E of all the (ordered) pairs (f, g) for f


(19) See, for example, R. Baer, Compositio Mathematica, vol. 1 (1934), pp. 274-275. Note
that this Theorem A is a special case of Theorem I.5.1 above.

(20) See R. Baer, Bulletin of the American Mathematical Society, vol. 46 (1940), pp. 800-806.

(21) H. Prüfer has introduced subgroups of this type into the study of primary abelian
groups; in analogy to Prüfer's terminology they may be called “groups of type P^∞” or “cycles of
order ∞.”




and g in F. Two such pairs (f, g) and (f', g') define the same element in E if,
and only if, f=f', g=g'; their sum is defined by (f, g)+(f', g') = (f+f', g+g')
and their product by (f, g)(f', g') = (ff', f^v g' + gf'). One verifies readily that E
is a ring with identity 1=(1, 0) and 0=(0, 0), that (f, g) possesses an inverse
in E if, and only if, f≠0, and that therefore the only right-ideal different from
0 and E is the set of all the elements (0, f) for f in F, since (0, f)(g, 0) = (0, fg).
This ideal (0, F) is clearly a two-sided ideal whose square is 0 so that E is
a primary ring. A left-ideal is formed by all the elements (0, f^v) for f in F, as
follows from the above formulas and the multiplicativity of v. Since 0 < F^v < F,
this left-ideal is different from all the right-ideals(22).

If in the preceding construction we would choose F as a finite field possessing
an automorphism v ≠ 1 (so that F = F^v), then E would be a finite, primary,
noncommutative ring all of whose ideals are two-sided.
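
The finite case of this construction can be checked by brute force. The sketch below is only an illustration under stated assumptions: it takes F = GF(4) with v the Frobenius map f -> f^2, and it reads the product rule as (f, g)(f', g') = (ff', f^v g' + gf'), which is the form consistent with (0, f)(g, 0) = (0, fg) and with (0, F^v) being a left-ideal. The encoding of GF(4) and all function names are hypothetical choices made for the example.

```python
from itertools import product

# GF(4) = {0, 1, a, a+1} encoded as 0..3 (bit i = coefficient of a^i), a^2 = a + 1.
def gf4_add(s, t):
    return s ^ t

def gf4_mul(s, t):
    r = 0
    for i in range(2):
        if (t >> i) & 1:
            r ^= s << i
    if r & 4:              # reduce a^2 to a + 1
        r ^= 0b111
    return r

def v(s):
    """Frobenius automorphism s -> s^2 of GF(4); it is not the identity."""
    return gf4_mul(s, s)

F = range(4)
E = list(product(F, F))    # all ordered pairs (f, g)

def add(p, q):
    return (gf4_add(p[0], q[0]), gf4_add(p[1], q[1]))

def mul(p, q):
    # assumed product rule: (f, g)(f', g') = (f f', f^v g' + g f')
    (f, g), (f2, g2) = p, q
    return (gf4_mul(f, f2), gf4_add(gf4_mul(v(f), g2), gf4_mul(g, f2)))

ONE, ZERO = (1, 0), (0, 0)

is_associative = all(mul(mul(p, q), s) == mul(p, mul(q, s))
                     for p in E for q in E for s in E)
is_distributive = all(mul(p, add(q, s)) == add(mul(p, q), mul(p, s)) and
                      mul(add(q, s), p) == add(mul(q, p), mul(s, p))
                      for p in E for q in E for s in E)
is_commutative = all(mul(p, q) == mul(q, p) for p in E for q in E)

# Units are exactly the pairs with f != 0.
units = {p for p in E if any(mul(p, q) == ONE and mul(q, p) == ONE for q in E)}

# N = (0, F): every nonzero element of N generates N on either side, so
# 0, N and E are the only one-sided ideals, and all of them are two-sided.
N = {(0, f) for f in F}
right_ideals_ok = all({mul(x, e) for e in E} == N for x in N if x != ZERO)
left_ideals_ok = all({mul(e, x) for e in E} == N for x in N if x != ZERO)
```

Since every proper one-sided ideal consists of non-units, and the non-units are exactly the elements of N, the last two checks suffice to see that this 16-element ring is primary with P = N and P^2 = 0.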

Application to the principles of projective geometry. If L is a primary ring
of subgroups of the abelian group G such that the orders of the cycles in L
do not exceed 1 (so that cycles not 0 are points) and such that there exist at
least three independent cycles of order 1 in L, then it is a consequence of
Theorem II.1.2 that L is the ring of all the E-admissible subgroups of G
where E is the ring of all the L-admissible endomorphisms of G; and it is a
consequence of Theorem II.2.2 that the ring E is a primary ring of endomorphisms
of G. Since the maximum order of the cycles in L is 1, it follows from
Theorem II.2.1 and (iv) that E is a (not necessarily commutative) field. But
this implies immediately that Desargues’ theorem is valid in the projective 
geometry represented by L. Since the possibility of representing the linear 
subspaces of a projective geometry by means of a ring of subgroups of an 
abelian group may be considered to be the essence of representations by 
means of homogeneous coordinates, we may state this result somewhat loosely 
as follows. 

A projective plane admits of a representation by means of homogeneous co- 
ordinates if, and only if, Desargues’ theorem holds true in it. 

3. The fundamental theorem of projectivity. If E is a ring of endomor- 
phisms of the abelian group G, and if f is an isomorphism of G upon the 
(whole) abelian group H, then there exists the inverse isomorphism f^{-1} of f
which maps H upon G; and mapping the endomorphism e in E upon e^f = f^{-1}ef
constitutes an isomorphism of E upon a ring of endomorphisms of H which
we call the isomorphism of E induced by f. 


THEOREM II.3.1(23). If E^i is a primary ring of endomorphisms of the abelian
group G^i (i=1, 2), if the ring D(G^i; E^i) of the E^i-admissible subgroups of G^i is
primary and contains either no cycles of order n or at least three independent ones,


(22) There are many possibilities of generalizing this construction.
(23) This theorem asserts (in the terminology of projective geometry) that every
projectivity is induced by a semi-linear transformation.




if E^i contains every D(G^i; E^i)-admissible endomorphism of G^i, and if p is a
projectivity of D(G^1; E^1) upon D(G^2; E^2), then there exists an isomorphism q of
G^1 upon G^2 which induces p in D(G^1; E^1) and which induces an isomorphism
of E^1 upon E^2.


The existence of an isomorphism q of G^1 upon G^2 which induces p in
D(G^1; E^1) is an immediate consequence of Theorem II.1.3. Furthermore we
know that such an isomorphism q induces an isomorphism of E^1 upon a ring
F of endomorphisms of G^2; and for reasons of symmetry it suffices to show
that F ≤ E^2. Thus if e is an endomorphism in E^1, S an E^2-admissible subgroup,
then Se^q = (S^{q^{-1}}e)^q ≤ S so that e^q is D(G^2; E^2)-admissible
and therefore in E^2.

If in particular G=G^1=G^2, E=E^1=E^2, then p is a projectivity of D(G; E)
(upon itself), q an automorphism of G and q induces an automorphism in E.
If g is an element in G, e an endomorphism in E, then (ge)^q = ((g^q)^{q^{-1}}e)^q = g^q e^q;
and it is easily verified that exactly those automorphisms of G which induce
an automorphism in E induce a projectivity of D(G; E) (upon itself). In
analogy to the distinction between linear and quasi-linear transformations we
say that the automorphism q of G is a proper E-automorphism(24), if q induces
the identity in E, that is, if q commutes with all the endomorphisms in E.

We consider now, throughout this section, an abelian group G, a primary
ring E of endomorphisms of G such that every left-ideal (as well as
right-ideal) in E is two-sided and such that the primary ring D(G; E) of subgroups
of G is the sum of a finite number of cycles in D(G; E). It is then a consequence
of Theorem II.2.1, Theorem II.2.4 (using the additional hypothesis
that G contains at least two independent cycles of maximum order m = m(E))
and Theorem I.3.7 that G splits completely and is primary in the Dedekind
set D(G; E). Consequently G is the direct sum of a finite number of cycles
Z_1, ..., Z_k in D(G; E); and it follows from Lemma I.3.8 that there exists a
cycle Z in D(G; E) which is not part of any proper partial sum of the Z_i; it is
readily verified that this is equivalent to saying that G = Z + Σ_{j≠i} Z_j for every i.

If G is the direct sum of the cycles Z_{1j} and of the cycles Z_{2j} in D(G; E),
then it follows from Corollary I.3.4 that the numbering may be effected in
such a way that n(Z_{1j}) = n(Z_{2j}) for every j. If furthermore Z_i is a cycle such
that G = Z_i + Σ_{k≠j} Z_{ik} for every j, then our generalization of the fundamental
theorem of projectivity may be stated as follows(25).


THEOREM II.3.2. There exists one and only one projectivity p of D(G; E)
which is induced by a proper E-automorphism of G and which satisfies: Z_{1j}^p = Z_{2j},
Z_1^p = Z_2.


(24) It is customary in the theory of abelian operator groups to admit only these “proper
E-automorphisms” as automorphisms.
(25) If one takes into account Theorem II.3.3 below.




Proof. To prove the unicity of p we consider a proper E-automorphism f
of G which leaves Z_1 and every Z_{1j} invariant. Since the Z's are cycles in
D(G; E), there exists an element z in Z_1 such that Z_1=zE; and since G is
the direct sum of the Z_{1j}, there exist uniquely determined elements z_j in Z_{1j}
such that z = z_1 + ... + z_k. From the choice of z and Z_1 it follows immediately
that Z_{1j} = z_jE. From the choice of f it follows Z_1 = z^fE, Z_{1j} = z_j^fE; and there exist
therefore elements e, e_j in E (though not in the prime ideal P of E) such that
ze = z^f, z_je_j = z_j^f. But now it follows immediately that z_je = z_je_j so that g^f = ge
for every g in G, that is, f induces the identity in D(G; E); and this proves
the unicity of the required projectivity.

To prove the existence of some projectivity meeting the requirements we
note first that there exist elements z_{ij} such that Z_{ij} = z_{ij}E and such that
Z_i = (z_{i1} + ... + z_{ik})E. Since n(z_{1j}) = n(z_{2j}) for every j, there exists one and
only one proper E-automorphism q such that z_{1j}^q = z_{2j}; and q clearly induces a
projectivity p of D(G; E) (upon itself) such that Z_{1j}^p = Z_{2j}, Z_1^p = Z_2.

The proper E-automorphism f of G shall be termed a perspectivity of G,
if there exists an E-admissible direct summand F of G such that G/F is a cycle
and such that every element in F is left invariant by f. That this definition(26)
is not too narrow may be seen from the following fact.


The projectivity p of D(G; E) (upon itself) is induced by a perspectivity of G,
if there exists an E-admissible direct summand T of G such that G/T is a cycle
and such that every E-admissible subgroup of T is left invariant by p, provided G
contains at least three independent elements of maximum order.


Proof. It is a consequence of Theorem II.3.1 that p is induced in D(G; E)
by some automorphism q of G which induces an automorphism of E. It is a
consequence of our general hypotheses that T possesses a basis B and B
contains certainly two different elements of maximum order m in G. If t is
an element of order m in B, and if b is an element not t in B, then t^q = te for e
in E though not in P, b^q = be', (b+t)^q = (b+t)e'' for e', e'' in E. Consequently
te = te'', be' = be''. Since t is of order m, P^m=0, we have e=e'', b^q = be' = be''
= be; and this implies x^q = xe for every x in T. Since e is not in P, there exists
an inverse e^{-1} of e in E. If we put y^f = y^q e^{-1} for every y in G, then f and q
induce the same projectivity p in D(G; E), but f is a perspectivity, since it
leaves every element in T invariant, and since tv^f = (t^{f^{-1}}v)^f = tv and
P^m = 0 imply v = v^f for v in E.


THEOREM II.3.3. The group of proper E-automorphisms of G is generated 
by the perspectivities of G. 


(26) It is readily verified that this definition and the customary definition (postulating a
center apart from the “axis” we postulated) coincide in the case of ordinary projective
geometry, provided one is able, as we are, to use Desargues' theorem.




Proof(27). The proper E-automorphism f of G shall be termed irreducible
(in the course of this proof), if there exists an E-admissible direct summand T 
of G with the following properties: 

(i) Every element in T is left invariant by f. 

(ii) If f is the product of the proper E-automorphisms u and v, if the
E-admissible direct summands U and V of G both contain T, and if u leaves 
the elements in U, v the elements in V invariant, then U=T or V=T. 

It is an obvious consequence of the maximum condition for E-admissible
subgroups, that every proper E-automorphism of G is the product of irreduci- 
ble proper E-automorphisms of G; and thus all we have to prove is the follow- 
ing statement. 

A proper E-automorphism of G is a perspectivity if, and only if, it is irreducible.

The irreducibility of perspectivities is a consequence of the fact that S=G,
if S is an E-admissible direct summand of G such that there exists an E-admissible
direct summand T of G satisfying T < S and G/T is a cycle.

Thus let us assume now that the proper E-automorphism f of G is irreducible.
Then there exists an E-admissible direct summand T of G such that f,
T satisfy the above conditions (i), (ii). Since T is a direct summand of G,
there exists an element b and an E-admissible subgroup U of G such that G
is the direct sum of T, U and bE and such that the orders of the elements in U
do not exceed n(b). If the elements b and b^f were independent modulo T,
then there would exist an E-admissible subgroup V of G such that G would
be the direct sum of T, V, bE and b^fE. Since b and b^f are of equal order, there
exists one and only one proper E-automorphism v of G which leaves every
element in T+V invariant and which interchanges b and b^f. This is impossible,
since v leaves every element in the direct summand T+V+(b+b^f)E invariant,
and since fv^{-1} leaves every element in the direct summand T+V+bE
invariant. Thus it follows that b and b^f are dependent modulo T; and this
implies that G is the direct sum of T, U and b^fE too. Consequently there exists
one and only one proper E-automorphism u of G which leaves every element
in T+U invariant and which maps b upon b^f. Since f is irreducible, and since
fu^{-1} leaves every element in T+bE invariant, it follows that T=T+U or
U=0; and this implies that f is a perspectivity.

If D(G; E) contains at least three independent cycles of maximum order,
then it follows from Theorem II.3.1 that every projectivity of D(G; E) upon
itself is induced by a proper E-automorphism of G if, and only if, every
automorphism of E is an inner automorphism. Thus we see that projective
geometry over the field of real numbers has, as far as the behaviour of
projectivities is concerned, more in common with abelian groups of order a
power of a prime than with projective geometry over the field of complex


(27) It should be noted that this proof is slightly simpler and proves more than the
customary proofs of the projective special case of this theorem.




numbers, since both the field of real numbers and the ring of integers modulo a 
power of a prime admit of the identity-automorphism only, whereas the field 
of complex numbers possesses an infinity of automorphisms. 

4. Duality and the theory of characters. Throughout this section we make
the following assumptions. E is a primary ring of endomorphisms of the
abelian group G; every ideal in E is a two-sided ideal in E; G is in the ring
D(G; E) of the E-admissible subgroups of G the sum of a finite number of
cycles; D(G; E) contains(28) at least two independent cycles of maximum
order m. We note the following consequences of these hypotheses (and Theorems
II.2.1, II.2.4 and I.3.7): if P is the prime ideal in E (unless E is a field
and P=0), then 0 = P^m < P^{m-1}; if S is an E-admissible subgroup of G, then S
is the direct sum of a finite number of cycles in D(G; E) and G/S is the direct
sum of a finite number of cycles in D(G/S; E).

A character(29) of G in E is a single-valued function f of the elements in G
with values in E such that f(ge+g'e') = f(g)e+f(g')e' for g, g' in G and e,
e' in E. If f and v are characters of G in E, and if e is an element in E, then
f(g)+v(g) and ef(g) are characters of G in E; and thus it follows that the set
Ch(G; E) of all the characters of G in E is an abelian group, admitting the
elements in E as left-operators. Characters and character group of Ch(G; E)
in E are defined in a like manner, apart from certain obvious interchanges of
right and left.


THEOREM II.4.1. (a) G is essentially the same as the group of characters of
Ch(G; E) in E. (b) D(G; E) and D(Ch(G; E); E) are duals of each other.


Proof. If g is an element in G, f an element in Ch(G; E), then we put
Q_g(f) = f(g). It is readily verified that Q_g is for every g in G a character of
Ch(G; E) in E, that Q_{x+y} = Q_x+Q_y, Q_{xe} = Q_xe for x, y in G, e in E. To prove that
Q_g = 0 implies g = 0, suppose that g≠0 and that B is a basis of G over E. Then
g = Σ_{b in B} be(b) for e(b) in E; and g≠0 implies that at least one be(b)≠0. If p is
a prime in E, then there exists one and only one character s=s_b of G in E
which maps b upon p^{m-n(b)} and all the other elements in B upon 0. Thus
s(g) = p^{m-n(b)}e(b); and this is not 0, since e(b) would otherwise be an element
in P^{n(b)} so that be(b) would be 0. Thus we have proved that an isomorphism
of G upon a group of characters of Ch(G; E) in E is established by mapping the
element g in G upon the character Q_g of Ch(G; E). To prove that this
isomorphism exhausts the character group of Ch(G; E) we consider again the
characters s_b of Ch(G; E) (for b in some basis B of G over E). From the fact
that every right- and every left-ideal in E is of the form Ep^i = p^iE = P^i we
infer readily that the s_b for b in B form a basis of Ch(G; E) over E. If v is a


(28) This last hypothesis is only needed in order to be able to apply Theorem II.2.4. After
this one application has been effected, this hypothesis may be dropped.

(29) For generalizations of the concept of character, see P. Lewis, Characters of abelian
groups, American Journal of Mathematics, vol. 64 (1942), pp. 81-105.




character of Ch(G; E) in E, then p^{n(b)}s_b = 0 implies that v(s_b) = p^{m-n(b)}d(b) for
d(b) in E. If g = Σ_{b in B} bd(b), then Q_g(s_b) = s_b(g) = p^{m-n(b)}d(b) = v(s_b) so that
Q_g = v; and this completes the proof of (a).

If S is an E-admissible subgroup of G, then we denote by (f(S) = 0) the
set of all the characters of G in E which map S upon 0; and the analogous
definition may be used for E-admissible subgroups S of Ch(G; E). If S is
an E-admissible subgroup of G, then every character of G/S in E is induced
by one and only one character in (f(S) = 0); and since we showed in the first
part of the proof that 0 is the only element in G/S mapped upon 0 by all the
characters of G/S in E, it follows that(30) (f((f(S) = 0)) = 0) = S. Since, by
(a), G is the character group of Ch(G; E), the same formula holds true for
E-admissible subgroups of Ch(G; E). But now it is readily verified that
mapping the E-admissible subgroup S of G upon the E-admissible subgroup
(f(S) = 0) of Ch(G; E) constitutes a biunivoque and monotonically decreasing
correspondence (that is, a duality) between D(G; E) and D(Ch(G; E); E).
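
The identity (f((f(S) = 0)) = 0) = S can be checked by enumeration in a small case. The following sketch is only an illustration, not part of the argument: it assumes E = Z/4 (so p = 2, m = 2) and G = Z/4 + Z/2, and it represents a character by its values on the basis b_1 = (1, 0), b_2 = (0, 1). All names in the code are hypothetical.

```python
from itertools import product

M = 4                                    # E = Z/4, so p = 2 and m = 2
G = list(product(range(4), range(2)))    # G = Z/4 (+) Z/2, basis (1,0), (0,1)

def g_add(g, h):
    return ((g[0] + h[0]) % 4, (g[1] + h[1]) % 2)

# A character is fixed by its values (c1, c2) on the basis; since the second
# basis element has order 2, its image c2 must be annihilated by 2.
CH = [(c1, c2) for c1 in range(4) for c2 in (0, 2)]

def value(f, g):
    """Value f(g) in E = Z/4 of the character f on the element g."""
    return (f[0] * g[0] + f[1] * g[1]) % M

def subgroups():
    """All subgroups of G (each is generated by at most two elements)."""
    def span(gens):
        S, frontier = {(0, 0)}, [(0, 0)]
        while frontier:
            a = frontier.pop()
            for x in gens:
                b = g_add(a, x)
                if b not in S:
                    S.add(b)
                    frontier.append(b)
        return frozenset(S)
    return {span([g, h]) for g in G for h in G}

def annihilator_in_ch(S):
    """(f(S) = 0): the characters mapping the subgroup S upon 0."""
    return frozenset(f for f in CH if all(value(f, g) == 0 for g in S))

def annihilator_in_g(T):
    """The elements of G mapped upon 0 by every character in T."""
    return frozenset(g for g in G if all(value(f, g) == 0 for f in T))

SUBS = subgroups()
double_annihilator_ok = all(annihilator_in_g(annihilator_in_ch(S)) == S for S in SUBS)
order_reversing = all(annihilator_in_ch(T) <= annihilator_in_ch(S)
                      for S in SUBS for T in SUBS if S <= T)
```

In this example the map S -> (f(S) = 0) is one-to-one and reverses inclusion, which is the duality asserted in (b) for this particular G and E.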


THEOREM II.4.2. If D(G; E) contains at least three independent cycles of 
maximum order, then the existence of an anti-automorphism of E is a necessary 
and sufficient condition for the existence of an auto-duality of D(G; E). 


REMARK. We note that consequently not every primary abelian operator
group is self-dual. Since there exists an auto-duality of D(G; E) whenever G
is the direct sum of two cycles of order 1 in D(G; E), the assumption of the
existence of at least three independent cycles of maximum order is indispensable(31).

Proof. Every element e in E induces an endomorphism e' of Ch(G; E);
since Ch(G; E) contains elements of order m (if n(b) = m, then the character s_b
constructed in the proof of Theorem II.4.1 is of order m) e' = 0 implies e = 0;
since (e+d)' = e'+d', (ed)' = d'e' for d, e in E, it follows that mapping e upon
e' constitutes an anti-isomorphism of E upon a ring E' of endomorphisms of
Ch(G; E). It is a consequence of Theorems II.4.1, (b), II.2.1 and II.2.2 that E'
contains every D(Ch(G; E); E)-admissible endomorphism of Ch(G; E).

If there exists an anti-automorphism of E, then there exists an isomorphism
f of E upon E' and there exists one and only one isomorphism v of G upon
Ch(G; E) which satisfies b^v = s_b for b in a basis B of G and s_b defined as in the
proof of Theorem II.4.1 and which satisfies furthermore (ge)^v = g^v e^f for g in G
and e in E. Thus there exists a projectivity of D(G; E) upon its (by Theorem
II.4.1, (b)) dual D(Ch(G; E); E), proving the self-duality of D(G; E).

If there exists an auto-duality of D(G; E), then there exists by Theorem 


(30) This identity is in the projective special case essentially the content of the theory of
linear equations.

(31) That the existence of a duality cannot be expected, unless G is the sum of a finite
number of cycles in D(G; E), has been pointed out before; cf. R. Baer, Duke Mathematical Journal,
vol. 5 (1939), pp. 824-838.




II.4.1, (b) a projectivity of D(G; E) upon D(Ch(G; E); E); and hence it follows
from Theorem II.3.1 that E and E' are isomorphic. Since E and E' have
been shown to be anti-isomorphic in the first paragraph of the proof, this
implies the existence of an anti-automorphism of E.

5. The theorem of Pappus. Throughout this section we shall assume that
E is a primary ring of endomorphisms of the abelian group G, that P is the
greatest two-sided ideal different from E in E, that the ring D(G; E) of all the
E-admissible subgroups of G is primary and contains either no subcycles of
order n or at least three independent ones, and that E contains every
D(G; E)-admissible endomorphism of G.

The seven cycles U_1, U_2, U_3 / Z / V_1, V_2, V_3 in D(G; E) are in Pappus
order(32), if they are of equal order n, if Z, U_1, V_1 are independent, and if


U2+Z = U3 = U2 = U2+ Us, 
Vi + Vz. 


Before stating the extension of Pappus’ theorem whose proof is the goal of 
this section, we establish a useful normal form for cycles in Pappus order. 


LEMMA II.5.1. If the seven cycles U_1, U_2, U_3 / Z / V_1, V_2, V_3 of order n in
D(G; E) are in Pappus order, if W(i, j) = W(j, i) for i≠j is the cross-cut of
U_i+V_j and U_j+V_i, then the W(i, j) are cycles of order n; and there exist three
independent elements z, u, v of order n in G and elements 1+x, y in E though not
in P such that

(N)   Z = zE,
      U_1 = uE, U_2 = (u - z)E, U_3 = (u(1+x) - zx)E,
      V_1 = vE, V_2 = (v - z)E, V_3 = (vy + z)E,
      W(1, 2) = (u + v - z)E, W(2, 3) = ((v - z)y + (u(1+x) - zx)(1+y))E,
      W(3, 1) = (u(1+x) - (vy + z)x)E.


Proof. There exist elements z, b in G such that Z=zE, U_1=bE. Then
U_2=(zr+bs)E for r, s in E. Since U_2 is part of U_1+Z, but of no proper partial
sum of this direct sum, neither r nor s can be in P (by Theorem II.2.1) so
that r^{-1} exists in E. Hence U_1 = uE for u = -bsr^{-1} and U_2 = (z-u)E. Likewise
we find an element v such that V_1=vE, V_2=(z-v)E. The independence of
z, u, v is a consequence of the independence of Z, U_1, V_1. Since U_3 is part of
the direct sum Z+U_2, but of no proper partial sum of Z+U_2, there exist ele-


(32) This arrangement of the cycles is necessitated by typesetting limitations. The more
suggestive form is used in (N) below. Note the asymmetry in the treatment of 1 and 2, though
it is possible to interchange 1 and 2, provided one interchanges at the same time U and V.
The customary form of stating the theorem of Pappus is so wide that there exists hardly a
geometry in which it holds true and it will become apparent from the proof of Theorem II.5.2
below that the restrictions we imposed upon the cycles Z, U_i, V_i are unavoidable.




ments r, s in E, though not in P, such that U_3 = (zr+(z-u)s)E. Let
x = -sr^{-1}-1. Then 1+x is not in P and U_3 = (u(1+x) - zx)E. Since V_3 is
part of the direct sum Z+V_1, but of no proper partial sum of Z+V_1, there
exist elements d, t in E, though not in P, such that V_3 = (zd+vt)E. Then
y = td^{-1} is not in P and V_3 = (vy+z)E.

The elements in W(1, 2) are of the form ur+(v-z)s = vh+(u-z)k; since
this equation implies ur=uk, zk=zs, vs=vh, it follows that W(1, 2)
= (u+v-z)E.

The elements in W(2, 3) are of the form (u-z)r+(vy+z)s = (v-z)h
+(u(1+x) - xz)k; this equation implies that r ≡ (1+x)k, ys ≡ h, r-s ≡ h+xk
mod P^n, as follows from the independence of z, u, v and from Theorem II.2.1;
but these congruences imply h ≡ ys, k ≡ r-xk ≡ h+s ≡ (y+1)s mod P^n so that
W(2, 3) = ((v-z)y+(u(1+x) - zx)(1+y))E.

The elements in W(3, 1) are of the form ur+(vy+z)s = vh+(u(1+x) - zx)k
and this implies ur = u(1+x)k, vys = vh, zs = -zxk or r ≡ (1+x)k, ys ≡ h,
s ≡ -xk mod P^n so that W(3, 1) = (u(1+x) - (vy+z)x)E.

Since z, u, v are three independent elements of order n, it is now clear that
the W(i, j) are cycles of order n.


THEOREM II.5.2. If D(G; E) contains three independent cycles of order n,
then the commutativity of E/P^n is equivalent to the validity of the nth property
of Pappus: If the cycles U_1, U_2, U_3 / Z / V_1, V_2, V_3 of order n in D(G; E) are in
Pappus order, if W(i, j) = W(j, i) is for i≠j the cross-cut of U_i+V_j and U_j+V_i,
then


W(1, 2) + W(2, 3) = W(2, 3) + W(3, 1) = W(3, 1) + W(1, 2).


Proof. Suppose first that the nth property of Pappus be satisfied by the
cycles in D(G; E), and suppose that x and y are two elements in E such that
neither 1+x nor y is in P. There exist in G three independent elements z, u, v
of order n; and the seven cycles


Z = zE,
U_1 = uE, U_2 = (u - z)E, U_3 = (u(1+x) - zx)E,
V_1 = vE, V_2 = (v - z)E, V_3 = (vy + z)E


are easily seen to be in Pappus order. Since consequently W(2, 3) ≤ W(2, 1)
+W(1, 3), we infer from Lemma II.5.1 the existence of elements r, s in E
such that


(v - z)y + (u(1+x) - zx)(1+y) = (u + v - z)r + (u(1+x) - (vy + z)x)s;

and these elements r, s must clearly satisfy the congruences

(1+x)(1+y) ≡ r + (1+x)s,   y ≡ r - yxs,   y + x(1+y) ≡ r + xs   mod P^n.


Subtracting the third from the first congruence, we find s ≡ 1 mod P^n, and
hence it follows by subtracting the second from the third congruence that
xy ≡ yx mod P^n.

If 1+x is in P, but y is not in P, then x is not in P; and x = 1+(x-1) together
with the results already obtained imply that (x-1)y ≡ y(x-1) mod P^n
so that xy ≡ yx mod P^n, if at least one of the two elements x and y is not in P.
If both are in P, then 1+x is not in P, so that (1+x)y ≡ y(1+x) mod P^n and
hence xy ≡ yx mod P^n; and this proves that E/P^n is a commutative ring.

If conversely E/P^n is a commutative ring, then any seven cycles of order n
in Pappus order may be assumed to be in the normal form (N) of Lemma
II.5.1. Since y(1+x) is not in P and possesses therefore an inverse in E, and
since we derive from the commutativity of multiplication the identity


u(1+x)(1+y) + vy - z(y + x(1+y)) - (u(1+x) - vyx - zx)
    = u(1+x)y + vy(1+x) - zy(1+x) = (u + v - z)y(1+x),


we find immediately that W(1, 2)+W(2, 3) = W(2, 3)+W(3, 1) = W(3, 1)+W(1, 2),
that is, the nth property of Pappus is satisfied in D(G; E).
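
The identity used in the converse direction can be tested mechanically. The sketch below is an illustration under stated assumptions: it compares the coefficients of u, v, z in the generators of W(1, 2), W(2, 3), W(3, 1) as given in the normal form, once over the commutative ring Z/27 and once over the ring of 2x2 matrices modulo 3. The matrix ring is not a primary ring; it is used only as a convenient noncommutative ring to show that the identity holds exactly when x and y commute. All function names are hypothetical.

```python
from itertools import product

def make_ring_zmod(m):
    """The commutative ring Z/m as (add, mul, neg, one)."""
    add = lambda a, b: (a + b) % m
    mul = lambda a, b: (a * b) % m
    neg = lambda a: (-a) % m
    return add, mul, neg, 1 % m

def make_ring_mat2(p):
    """2x2 matrices over Z/p, stored as (a, b, c, d) for [[a, b], [c, d]]."""
    add = lambda A, B: tuple((s + t) % p for s, t in zip(A, B))
    def mul(A, B):
        a, b, c, d = A
        e, f, g, h = B
        return ((a*e + b*g) % p, (a*f + b*h) % p,
                (c*e + d*g) % p, (c*f + d*h) % p)
    neg = lambda A: tuple((-s) % p for s in A)
    return add, mul, neg, (1, 0, 0, 1)

def pappus_identity_holds(x, y, ring):
    """Check (generator of W(2,3)) - (generator of W(3,1))
       == (generator of W(1,2)) * y(1+x), coefficient by coefficient."""
    add, mul, neg, one = ring
    sub = lambda a, b: add(a, neg(b))
    ox = add(one, x)                      # 1 + x
    oy = add(one, y)                      # 1 + y
    # Coefficients of (u, v, z); E acts on the right, so (u a) b = u (ab).
    w12 = (one, one, neg(one))                          # u + v - z
    w23 = (mul(ox, oy), y, neg(add(y, mul(x, oy))))     # (v-z)y + (u(1+x)-zx)(1+y)
    w31 = (ox, neg(mul(y, x)), neg(x))                  # u(1+x) - (vy+z)x
    lhs = tuple(sub(a, b) for a, b in zip(w23, w31))
    c = mul(y, ox)                                      # y(1+x)
    rhs = tuple(mul(a, c) for a in w12)
    return lhs == rhs

Z27 = make_ring_zmod(27)
all_hold_commutative = all(pappus_identity_holds(x, y, Z27)
                           for x in range(27) for y in range(27))

M3 = make_ring_mat2(3)
mats = list(product(range(3), repeat=4))
mul3 = M3[1]
identity_matches_commutation = all(
    pappus_identity_holds(X, Y, M3) == (mul3(X, Y) == mul3(Y, X))
    for X in mats for Y in mats)
some_failure = any(not pappus_identity_holds(X, Y, M3) for X in mats for Y in mats)
```

Each of the three coefficients reduces to the single condition (1+x)y = y(1+x), which is why the check agrees with commutation of x and y in both rings.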


COROLLARY II.5.3. If the nth property of Pappus is satisfied in D(G; E)
(and if D(G; E) contains cycles of order n), then the (n-1)st property of Pappus
is satisfied in D(G; E).


For E/P^{n-1} is commutative, if E/P^n is commutative.


COROLLARY II.5.4. The nth property of Pappus is satisfied in D(G; E) for
every n if, and only if, E is commutative.


For E is commutative, if every E/P^n is commutative, since 0 is the cross-cut
of the P^n.

6. The ordinary primary abelian groups. We assume throughout this sec- 
tion that L is a primary ring of subgroups of the abelian group G which either 
does not contain any cycles of order n or at least three independent ones. It is 
the object of this section to find conditions on the (abstract) Dedekind set L 
which assure that L contains every subgroup of G. 

If we denote by E the ring of all the L-admissible endomorphisms of G,
then it is a consequence of Theorem II.1.2 that L is exactly the set D(G; E)
of all the E-admissible subgroups of G; and it is a consequence of Theorems
II.2.2 and II.2.3 that E is a primary ring of endomorphisms. Thus there exists
in E a uniquely determined greatest two-sided ideal P different from E; and
all the elements in E that are not in P possess inverses in E.

The integral multiples of the unit in E (which we denote by 0, ±1, ±2, ...)
form a subring E_0 of E. Clearly E_0 is part of the central of E; and the cross-cut
of E_0 and P is either 0, in which case E is said to be of characteristic(33)


(33) The characteristic of the ring E is exactly the characteristic of the field E/P in the
customary terminology.




0, or consists of all the multiples of a certain rational prime number r, in
which case E is said to be of characteristic(33) r.

If the set of all the subgroups of the abelian group C is a cycle of order n,
then C is a cyclic group containing c^n elements for c a rational prime number.
If the primary ring L of subgroups of G contains every subgroup of G, then
the characteristic of E is a rational prime number r, cycles of order n in
L are cyclic groups of order r^n; and(34) E is the ring of all the r-adic integers,
provided the orders of the cycles in L are not bounded; whereas E is the ring
of the rational integers modulo r^m, if m is the maximum order of the cycles in
L; in short: G is a primary abelian group of characteristic r.

The six cycles W, Z_1, Z_2; W_1, W_2, Z form a complete triangle of order n
with vertex W [and basis Z_1+Z_2], if

(a) W, Z_1, Z_2 are three independent cycles of order n in L,

(b) Wit W4+2Z:=Z:+ 
™ W. 

A normal form for complete triangles is established by the following 
lemma. 


LEMMA II.6.1. The six cycles W, Z_1, Z_2; W_1, W_2, Z form a complete triangle
of order n if, and only if, there exist in G three independent elements w, z_1, z_2
of order n such that W=wE, Z_i=z_iE, W_i=(w+z_i)E, Z=(z_1-z_2)E.


Proof. The sufficiency of the condition is readily verified. Thus we assume
that the six cycles form a complete triangle. There exists an element w
such that W=wE. Since W is part of Z_i+W_i, but of no proper part of this
sum, there exist elements z_i, w_i such that w=w_i-z_i, W_i=w_iE, Z_i=z_iE. Since
w, z_1, z_2 are independent elements of order n, and since Z is part of Z_1+Z_2
and W_1+W_2, but of no proper part of these sums, one may readily verify that
Z=(z_1-z_2)E.

If the six cycles W, Z_1, Z_2; W_1, W_2, Z form a complete triangle of order n
in L, then we define derived elements W^{(j)}, W_i^{(j)} inductively for j = -1, 0,
1, 2, ... by the following rules.

W^{(-1)} = W, W_i^{(-1)} = W_i;
W^{(j+1)} is the cross-cut of W and W_1 + W_1^{(j)};
V^{(j+1)} is the cross-cut of W^{(j+1)} + Z and W_1^{(j)} + W_2;
W_1^{(j+1)} is the cross-cut of V^{(j+1)} + Z_2 and Z_1 + W^{(j+1)};
W_2^{(j+1)} is the cross-cut of W_1^{(j+1)} + Z and Z_2 + W^{(j+1)}.

In particular we call W^{(j)} the jth derivative of the vertex of the complete triangle.
Lemma II.6.2. If the six cycles W, Z_1, Z_2; W_1, W_2, Z form a complete triangle 
(*) Cf. R. Baer, American Journal of Mathematics, vol. 59 (1937), p. 110, Theorem 5.2. 




of order n in L, and if w, z_i are elements in G such that W=wE, Z_i=z_iE, 
W_i=(w+z_i)E, Z=(z_1-z_2)E, then the derived elements satisfy: W^(j)=(wj)E, 
W_i^(j)=(z_i-wj)E. 


Proof (by induction). Our contention is obvious for j = -1 and thus we 
assume it to be true for j in order to prove it for j+1. Using the fact that 
w, z_1, z_2 are three independent elements of order n, one verifies: 

W^(j+1) is the cross-cut of wE and (w+z_1)E+(z_1-wj)E and hence equals 
w(j+1)E; 

V^(j+1) is the cross-cut of w(j+1)E+(z_1-z_2)E and (z_1-wj)E+(w+z_2)E 
and hence equals (z_1-z_2-w(j+1))E; 

W_1^(j+1) is the cross-cut of (z_1-z_2-w(j+1))E+z_2E and z_1E+w(j+1)E and 
hence equals (z_1-w(j+1))E; 

W_2^(j+1) is the cross-cut of (z_1-w(j+1))E+(z_1-z_2)E and z_2E+w(j+1)E 
and hence equals (z_2-w(j+1))E; 
and this completes the proof of the lemma. 
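
Two of the induction steps above can be checked mechanically in a small case. The following Python sketch is only an illustration under stated assumptions: it takes G = (Z/9)^3 with the rational integers as operators (so that xE is the cyclic subgroup generated by x), and the helper names span, join and meet are hypothetical, not notation of the paper.

```python
# Illustrative check of two induction steps of Lemma II.6.2 in G = (Z/9)^3.
# Assumption: E acts as the rational integers, so xE is the cyclic subgroup generated by x.
N = 9  # r = 3, so the cycles wE, z1E, z2E have order n = 2


def add(x, y):
    return tuple((a + b) % N for a, b in zip(x, y))


def scale(k, x):
    return tuple((k * a) % N for a in x)


def minus(x, y):
    return add(x, scale(-1, y))


def span(*gens):
    """Subgroup generated by gens, built one generator at a time."""
    elems = {(0, 0, 0)}
    for g in gens:
        elems = {add(e, scale(k, g)) for e in elems for k in range(N)}
    return frozenset(elems)


def join(a, b):
    """Sum of two subgroups."""
    return frozenset(add(x, y) for x in a for y in b)


def meet(a, b):
    """Cross-cut of two subgroups."""
    return a & b


w, z1, z2 = (1, 0, 0), (0, 1, 0), (0, 0, 1)


def step_for_vertex(j):
    # cross-cut of wE and (w+z1)E + (z1-wj)E should equal w(j+1)E
    lhs = meet(span(w), join(span(add(w, z1)), span(minus(z1, scale(j, w)))))
    return lhs == span(scale(j + 1, w))


def step_for_w1(j):
    # cross-cut of (z1-z2-w(j+1))E + z2E and z1E + w(j+1)E should equal (z1-w(j+1))E
    wj1 = scale(j + 1, w)
    lhs = meet(join(span(minus(minus(z1, z2), wj1)), span(z2)),
               join(span(z1), span(wj1)))
    return lhs == span(minus(z1, wj1))


vertex_ok = all(step_for_vertex(j) for j in range(-1, N))
w1_ok = all(step_for_w1(j) for j in range(-1, N))
# (3w)E is a proper subcycle of wE, in line with characteristic r = 3
third_derivative_is_proper = span(scale(3, w)) < span(w)
print(vertex_ok, w1_ok, third_derivative_is_proper)
```

The sketch only tests the two steps whose operands survive legibly in the text; it is not a substitute for the induction itself.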


THEOREM II.6.3. If the primary ring L of subgroups of the abelian group G 
contains either no cycle of order n or at least three independent ones, then the 
following conditions are necessary and sufficient for L to contain every subgroup 
of G. 

(i) If the subgroup S in L is the direct sum of two cycles of order 1 in L, 
then the number of subgroups in L that are part of S is r+3 for r a rational 
prime number. 

(ii) If W is the vertex of some complete triangle of order n in L, then there 
exists a superscript j such that the jth derivative W^(j) of W is the subcycle of order 
n-1 of W. 

If (i) and (ii) are satisfied, then G is a primary abelian group of characteristic r. 
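
The count r+3 in condition (i) can be seen directly in the simplest case. The sketch below is a hedged illustration, assuming that L consists of all subgroups and that S is the elementary group Z_r x Z_r; the function name subgroups_of_square is hypothetical.

```python
# Count the subgroups of Z_r x Z_r (a direct sum of two cycles of order 1)
# for a few small primes r. Illustrative only; L is assumed to contain every subgroup.
def subgroups_of_square(r):
    elems = [(a, b) for a in range(r) for b in range(r)]
    found = set()
    for g in elems:
        for h in elems:
            # every subgroup of Z_r x Z_r is generated by at most two elements
            found.add(frozenset(
                ((i * g[0] + k * h[0]) % r, (i * g[1] + k * h[1]) % r)
                for i in range(r) for k in range(r)))
    return found


counts = {r: len(subgroups_of_square(r)) for r in (2, 3, 5, 7)}
print(counts)
```

Each count is r+3: the null subgroup, the r+1 subgroups of order r, and S itself.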


Proof. The necessity of condition (i) is an obvious consequence of the 
fact that G is a primary abelian group of characteristic r, if L contains every 
subgroup; and the necessity of (ii) is immediately derived from Lemmas 
II.6.1 and II.6.2. Assume conversely that conditions (i) and (ii) are satisfied 
by L. If L contains cycles of order n different from 0, then L contains three 
independent elements w, z_1, z_2 of order n. The six cycles W=wE, Z_i=z_iE, 
W_i=(w+z_i)E, Z=(z_1-z_2)E form a complete triangle of order n. Since the 
subcycle of order n-1 of W is, by Theorem II.2.1, just WP, it follows from 
condition (ii) and from Lemma II.6.2 that there exists an integer j such that 
WP=wjE; and this implies that P=jE for j in E_0. 

It is a consequence of condition (i) that E/P is a prime field of prime 
number characteristic r so that E/P^n consists of exactly r^n elements, provided 
there exist (as we assume just now) cycles of order n in L. If we denote by P_0 
the cross-cut of P and E_0, then it follows from P=jE for j in E_0, that E_0/P_0^n 
consists of r^n elements too so that E/P^n and E_0/P_0^n are essentially the same. 
If G_n is the sum of all the cycles of an order not exceeding n in L, then it 
follows from the fact that L contains every E-admissible subgroup of G, that 
every subgroup of G_n belongs to L, since every subgroup of G_n is E_0-admissible, 
therefore E/P^n-admissible, therefore E-admissible. Since every element 
in G is contained in some G_n, it follows finally that every subgroup of G 
belongs to L and that G is a primary abelian group of characteristic r. 

We add some remarks. If S is any subset of the ring L of subgroups of G, 
then we denote by N(S) the net determined by S, that is, the smallest subset 
of L which contains S and which contains with any two elements their sum 
and their cross-cut. If, for example, S consists of the six cycles of a complete 
triangle, then one may prove in a similar fashion as Theorem II.6.3 that N(S) 
contains every part of the sum of the cycles of the triangle, provided L con- 
tains every subgroup of G. 
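
The remark on N(S) can be illustrated in the smallest case. The following Python sketch assumes G = (Z/2)^3 with L the set of all subgroups, and takes S to be the six cycles of the complete triangle built from three independent elements; the names span, join, meet and net are hypothetical helpers.

```python
# Net generated by the six cycles of a complete triangle in G = (Z/2)^3.
# Elements of G are 3-bit integers and addition is XOR; all names are illustrative.
from itertools import combinations


def span(*gens):
    s = {0}
    for g in gens:
        s |= {x ^ g for x in s}
    return frozenset(s)


def join(a, b):
    """Sum of two subgroups."""
    return frozenset(x ^ y for x in a for y in b)


def meet(a, b):
    """Cross-cut of two subgroups."""
    return a & b


def net(start):
    """Smallest set containing start and closed under sum and cross-cut."""
    closed = set(start)
    while True:
        new = {op(a, b) for a in closed for b in closed for op in (join, meet)} - closed
        if not new:
            return closed
        closed |= new


w, z1, z2 = 1, 2, 4
# W, Z_1, Z_2, W_1, W_2, Z as in Lemma II.6.1 (minus equals plus in characteristic 2)
triangle = [span(w), span(z1), span(z2), span(w ^ z1), span(w ^ z2), span(z1 ^ z2)]
all_subgroups = {span(*c) for k in range(4) for c in combinations(range(1, 8), k)}
generated = net(triangle)
print(len(generated), len(all_subgroups))
```

In this case the net already contains every subgroup of G, since the seventh cycle of order 1 arises as the cross-cut of W_1+Z_2 and W_2+Z_1.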

Applying Lemmas II.6.1 and II.6.2 one verifies immediately the following 
characterization of the characteristic of E. 

Suppose that W is the vertex of a complete triangle of order ≠0 in L. 
Then the characteristic of E is 0 if, and only if, every derivative of the vertex 
W is equal to W; and the characteristic of E is the rational prime number r if, 
and only if, the rth derivative W^(r) of W is a proper subcycle of W (possibly 0). 


Part III. CONSTRUCTION OF THE UNDERLYING GROUP 


In the first part a class of Dedekind sets was determined which exhibited 
the salient features of both projective spaces and finite abelian groups. In the 
second part we introduced the primary abelian operator groups as those oper- 
ator groups whose sets of admissible subgroups just met the requirements 
postulated in the first part. In this part we are going to prove that the (ab- 
stract) Dedekind sets with these properties may always be realized as the sets 
of admissible subgroups of a primary abelian operator group, provided they 
are “big enough.” This last restriction is not surprising, considering the im- 
possibility of obtaining a complete theory in the projective plane—there does 
not exist a planar proof of Desargues’ theorem. The problem of determining 
the minimum number of parameters necessary for the validity of our theorem 
is still an open one; it is clear that it cannot be less than four and we prove 
that it is at most six. 

The discussion in this part is based on the results of the two preceding 
parts. The method used is rather different from those customarily employed 
in dealing with similar problems of projective geometry, though an extension 
of Desargues’ theorem is of central importance—and this part has some in- 
trinsic interest apart from its applications. But in projective geometry it is 
customary to construct first the field of coordinates and then to introduce the 
linear subspaces by means of linear equations. Considering that the coordinates 
are just the operators operating on the underlying group, this amounts 
to constructing the operators first and the group on which they operate only 
afterward. We invert this order of procedure, construct first the group with 
a set of distinguished subgroups representing the given Dedekind set and 
obtain operators as well as primarity as an application of theorems in Part II. 
Apart from this difference in the order of procedure one may say that the 
main difference consists of the true projectivity of our method and of the 
complete avoidance of affine means—the affine method consisting in first find- 
ing a representation for all the elements outside a certain distinguished hyper- 
plane, a representation which is extended afterwards over the whole space, 
the projective method avoiding this preferential treatment of some element 
and thus finding representations of all the elements (as subgroups) at the same 
time. 

1. Collinearity. The three elements u, v, w in the Dedekind set D are said 
to be collinear, if u+v=v+w=w+u. 


Lemma III.1.1. If u+w=w+v, then u, v and w(u+v) are collinear. 
For it follows from Dedekind’s law that 
u + v = (u + w)(u + v) = u + w(u + v) 
= (v + w)(u + v) = v + w(u + v). 
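
As a sanity check, Dedekind’s law and Lemma III.1.1 can be verified exhaustively in a small Dedekind set. The sketch below assumes the set of all subgroups of (Z/2)^3, which is only one convenient example; the helper names are hypothetical.

```python
# Exhaustive check of Dedekind's law and of Lemma III.1.1 in the set of all
# subgroups of (Z/2)^3. Elements are 3-bit integers, addition is XOR.
from itertools import combinations


def span(*gens):
    s = {0}
    for g in gens:
        s |= {x ^ g for x in s}
    return frozenset(s)


def join(a, b):
    """Sum of two subgroups."""
    return frozenset(x ^ y for x in a for y in b)


def meet(a, b):
    """Cross-cut of two subgroups."""
    return a & b


def collinear(u, v, w):
    return join(u, v) == join(v, w) == join(w, u)


D = {span(*c) for k in range(4) for c in combinations(range(1, 8), k)}

# Dedekind's law: a part of c implies a + bc = (a + b)c
dedekind_ok = all(join(a, meet(b, c)) == meet(join(a, b), c)
                  for a in D for b in D for c in D if a <= c)

# Lemma III.1.1: u + w = w + v implies that u, v, w(u + v) are collinear
lemma_ok = all(collinear(u, v, meet(w, join(u, v)))
               for u in D for v in D for w in D if join(u, w) == join(w, v))

print(len(D), dedekind_ok, lemma_ok)
```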


Lemma III.1.2. If there exists an element e such that a, c, e and b, d, e are 
triplets of collinear elements, then both (a+b)(c+d), c, d and (a+b)(c+d), a, b 
are triplets of collinear elements. 


This follows from Lemma III.1.1, since for example, 
a + (c + d) = c + e + d = (c + d) + b. 


2. The theorem of Desargues. The elements w(i, j) = w(j, i) for i≠j are 
termed connecting links of the two triplets u(i) and v(i) (i = 1, 2, 3) of elements 
in the Dedekind set D, if u(i), w(i, j), u(j) and v(i), w(i, j), v(j) are two 
triplets of collinear elements for every i≠j. 


THEOREM III.2.1. If the elements w(i, j) are connecting links of the two triplets 
u(i) and v(i), and if v(3)(u(1)+u(2)+u(3)) = 0, then the w(i, j) are collinear. 


Proof. If i, j, h is any permutation of the three numbers 1, 2, 3, then it 
follows from the definition of connecting links that 


u(1) + u(2) + u(3) = w(i, j) + w(j, h) + u(3), 
v(1) + v(2) + v(3) = w(i, j) + w(j, h) + v(3); 


and hence it follows from the hypothesis and Dedekind’s law that 


w(i, j) + w(j, h) = w(i, j) + w(j, h) + v(3)(u(1) + u(2) + u(3)) 
= (w(i, j) + w(j, h) + v(3))(u(1) + u(2) + u(3)) 
= (v(1) + v(2) + v(3))(u(1) + u(2) + u(3)), 


proving the collinearity of the connecting links w(i, j). 
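
A concrete instance of Theorem III.2.1 may help fix the notation. The sketch below assumes the set of all subgroups of (Z/2)^4 and one particular choice of the triplets u(i), v(i), both picked only for illustration; span, join, meet and collinear are hypothetical helper names.

```python
# One instance of Theorem III.2.1 among the subgroups of (Z/2)^4.
# Elements are 4-bit integers and addition is XOR; the configuration is an assumed example.
def span(*gens):
    s = {0}
    for g in gens:
        s |= {x ^ g for x in s}
    return frozenset(s)


def join(a, b):
    """Sum of two subgroups."""
    return frozenset(x ^ y for x in a for y in b)


def meet(a, b):
    """Cross-cut of two subgroups."""
    return a & b


def collinear(a, b, c):
    return join(a, b) == join(b, c) == join(c, a)


e1, e2, e3, e4 = 1, 2, 4, 8
u = {1: span(e1), 2: span(e2), 3: span(e3)}
v = {1: span(e1 ^ e4), 2: span(e2 ^ e4), 3: span(e3 ^ e4)}

# candidate connecting links w(i, j) = (u(i) + u(j))(v(i) + v(j))
w = {(i, j): meet(join(u[i], u[j]), join(v[i], v[j]))
     for i, j in ((1, 2), (2, 3), (3, 1))}

links_ok = all(collinear(u[i], w[i, j], u[j]) and collinear(v[i], w[i, j], v[j])
               for (i, j) in w)
hypothesis_ok = meet(v[3], join(join(u[1], u[2]), u[3])) == span()
conclusion_ok = collinear(w[1, 2], w[2, 3], w[3, 1])
print(links_ok, hypothesis_ok, conclusion_ok)
```

Here the three links are the subgroups generated by e1+e2, e2+e3 and e3+e1, which all lie in one subgroup of order 4.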

The seven cycles u(1), u(2), u(3) / z / v(1), v(2), v(3) are in Desargues 
order, if 

(a) z≠0, n(u(i)) ≤ n(z), n(v(i)) ≤ n(z) for i = 1, 2, 3, 

(b) z, u(i), v(i) are collinear for i = 1, 2, 3, 

(c) z(u(i)+u(j))(v(i)+v(j)) = 0 for i≠j(3). 

Since a cycle c≠0 contains one and only one subcycle c* of order 1 (a 
notation we shall use throughout), condition (c) is equivalent to the following 
handier condition 

(c’) z(u(i)+u(j)) = 0 or z(v(i)+v(j)) = 0 for i≠j. 

From these conditions we derive the following helpful statement 

(d) u(i)v(i) = 0; and if z(u(i)+u(j)) = 0, then n(v(i)) = n(z) and 
v(i)(u(i)+u(j)) = 0. 

If zu(i)=0, then 2, (b)—and 
v(t) /(u(i)v(t)) are isomorphic cycles; and hence it follows from (a) that 
u(t)v(t)=0 and n(v(t)) =n(z).—If =0, though v(7)(u(t) +u(7)) 
then v(t) #0 and v(7)* Su(i)+u(j). If 0, then it follows from 
u(t)v(t) =0 and Theorem I.2.2 that 2* Su(i)*+v(i)* Su(i)+u(j), contradict- 
ing our assumption. Thus u(t) =0; and this implies v(¢)* = u(j)* and v(¢) =z— 
by (b)—so that 2* =u(j)*, an impossibility which proves (d). 

(e) The elements w(i, j) = (u(i)+u(j))(v(i)+v(j)) are the only connecting 
links of the two triplets u(i) and v(i). 

That the w(i, j) are connecting links is an immediate consequence of 
Lemma III.1.2. If the elements x(i, j) are connecting links too, and if, for 
example, z(u(i)+u(j)) = 0, then v(i)(u(i)+u(j)) = 0 and hence 

x(i, j) = x(i, j) + v(i)(u(i) + u(j)) = (x(i, j) + v(i))(u(i) + u(j)) 
= (v(j) + v(i))(u(i) + u(j)) = w(i, j) 
by Dedekind’s law. 

(f) If z+u(1)+u(2)+u(3) splits completely, then the connecting links 
w(i, j) are cycles. 

If w(i, j) were not a cycle, then it would follow from Theorem I.3.6 
that w(i, j) contains at least two different subcycles of order 1. Hence u(i)≠0, 
u(i)* ≤ v(i)+v(j) and v(i)≠0, v(i)* ≤ u(i)+u(j), contradicting (c’) and (d). 


DESARGUES’ PROPERTY. If the seven cycles u(1), u(2), u(3) / z / v(1), v(2), v(3) 


(3) It may be seen from trivial examples that this last condition is indispensable, though it 
seems to be fairly common to omit it. 




are in Desargues order, then their connecting links w(2, 3), w(3, 1), w(1, 2) are 
collinear. 


We note that this property refers to all cycles in a certain Dedekind set. 


THEOREM III.2.2. If the element w splits completely, is primary and contains 
at least five independent cycles of maximum order, then Desargues’ property is 
satisfied by the cycles in the Dedekind set w/u for every part u of w. 


Proof (**). We show first that Desargues’ property is satisfied by the sub- 
cycles of w; and we shall derive the general property from the special case. 

Thus let us assume that the seven subcycles z, u(i), v(i) of w are in 
Desargues order; and put n(z) = m. Since z, u(i), v(i) are collinear for every i, 
we find that s = z+u(1)+u(2)+u(3) = z+v(1)+v(2)+v(3). Since there exist 
at least five independent subcycles of order m of w, and since s is a sum of four 
cycles only, it follows that there exists a subcycle p of w whose order is m 
and which is independent of s. 

Since pz = 0, and since p and z are cycles of equal order m, it follows from 
Theorem I.3.7 and Lemma I.3.8 that there exists a subcycle q of p+z which 
is not part of any proper partial sum of p+z and that p+z is the direct sum 
of p and q, of z and q, of z and p, n(q) = m. 

We prove next that the elements r(i) = (p+u(i))(q+v(i)) are cycles. 

For if this were not true, then r(i) would contain by Theorem I.3.6 
two different subcycles of order 1; and since both p+u(i) and q+v(i) are 
sums of two cycles, this would imply that q* ≤ p+u(i), p* ≤ q+v(i) and 
(p+u(i))* = (q+v(i))* (where we denote by r* the sum of all the subcycles 
of order 1 of the element r, a notation which we shall use throughout). As a 
consequence of (c’) we have zu(i) = 0 or zv(i) = 0. In the first case we obtain: 
(p+u(i))* = p*+q* = p*+z* = z*+u(i)*, an inference contradicting sp = 0; 
and in the same way we obtain a contradiction from zv(i) = 0 so that the r(i) 
have to be cycles. 

Since z, p, q and z, u(i), v(i) and z, u(j), v(j) are three triplets of collinear 
elements, it follows from Lemma III.1.2 that the elements w(i, j), r(j), r(i) are 
connecting links of the two triplets p, u(i), u(j) and q, v(i), v(j); note that 
the elements w(i, j) = (u(i)+u(j))(v(i)+v(j)) are the connecting links of the 
triplets u(i) and v(i). From (c’) it follows that not both z(u(i)+u(j)) and 
z(v(i)+v(j)) can be different from 0. If z(u(i)+u(j)) = 0, then it follows from 
p(z+u(i)+u(j)) = 0 and from Dedekind’s law that 0 = (p+z)(u(i)+u(j)) 
= (p+q)(u(i)+u(j)); and this together with pq = 0 implies 0 = q(p+u(i)+u(j)); 
and if z(v(i)+v(j)) = 0, then we prove likewise that 0 = p(q+v(i)+v(j)). Thus 
it follows in either case from Theorem III.2.1 that w(i, j), r(i) and r(j) are 
collinear. 


(®) This is an adaptation of the proofs of Desargues’ theorem as given in projective ge- 
ometry; cf., for example, Veblen and Young, Projective Geometry, Boston, 1910, p. 41. 




From ps = 0, r(i) ≤ p+u(i) and Dedekind’s law we infer that 
r(i)(u(1) + u(2) + u(3)) = r(i)(p + u(i))(u(1) + u(2) + u(3)) 
= r(i)(u(i) + p(u(1) + u(2) + u(3))) = r(i)u(i) 
and likewise it follows that 
r(i)(v(1) + v(2) + v(3)) = r(i)v(i). 


Since r(i) is a cycle, and since u(i)v(i) = 0 by (d), it is impossible that both 
r(i)(u(1)+u(2)+u(3)) and r(i)(v(1)+v(2)+v(3)) are different from 0. If 
r(i)(u(1)+u(2)+u(3)) = 0, then the connecting links w(i, j) of the triplets u(i) 
and r(i) are collinear by Theorem III.2.1; and if r(i)(v(1)+v(2)+v(3)) = 0, 
then the connecting links w(i, j) of the triplets v(i) and r(i) are collinear by 
Theorem III.2.1 so that the connecting links w(i, j) of the triplets u(i) and 
v(i) are collinear in any case. 

If u is a part of w, and if u(1), u(2), u(3) / z / v(1), v(2), v(3) are cycles in 
Desargues order in the Dedekind set w/u of all the elements between u and w 
(whose null-element is u), then s = z+u(1)+u(2)+u(3) = z+v(1)+v(2)+v(3). 
Since the parts of w are sums of cycles by Theorem I.3.7 there exist subcycles 
c, c(i) of w (in D) such that z = u+c and u(i) = u+c(i). If t = c+c(1)+c(2)+c(3), 
then s = u+t and s/u and t/(tu) are isomorphic. Thus it suffices to prove that 
the cycles in the Dedekind set t/(tu) satisfy Desargues’ property. Since t is a 
sum of four cycles, and since there exist at least five independent cycles of 
maximum order in w, there exists a subcycle p of w such that pt = 0 and such 
that n(p) = n(z/u). Now it is clear that the system (p+t)/(tu) meets, as far 
as the subcycles of t/(tu) are concerned, all the requirements we needed in 
the first part of the proof; and this completes the proof of our theorem. 

3. The vectors. A complete n-m-simplex S = S_n^m consists of n independent 
cycles c(1), ..., c(n) of order m (the vertices of S) together with cycles 
c(i, j) = c(j, i) for i≠j (the links of S) subject to the following conditions. 

(i) c(i), c(j) and c(i, j) are collinear. 

(ii) c(i, j), c(j, k) and c(k, i) are collinear, if i, j, k are three different 
integers. 

Since c(i) and c(j) are for i≠j independent cycles of the same order m, and 
since c(i, j) is a cycle, the order of c(i, j) must be m too and c(i, j)c(k) = 0 
for every k. 


Lemma III.3.1. If the cycles c(1), ..., c(n) of order m are independent, if 
c(1)+...+c(n) splits completely and is primary and if 4<n, then the c(i) are the 
vertices of some complete n-m-simplex. 


Proof. Since c(1)+c(i) is for 1<i the direct sum of two cycles of equal 
order m, and is at the same time primary, it follows from Lemma I.3.8 that 
there exists a cycle c(1, i) = c(i, 1) of order m such that c(1), c(i) and c(1, i) 
are collinear. 

If i, j, k are three different integers which are all different from 1, then 
c(1)(c(i)+c(j)+c(k)) = 0 so that the seven cycles 


c(1, i), c(1, j), c(1, k) / c(1) / c(i), c(j), c(k) 


are in Desargues order; and their connecting links c(j, k), c(i, k), c(i, j) are 
collinear by Theorem III.2.2 where c(i, j) = (c(i)+c(j))(c(1, i)+c(1, j)) is by (e), 
(f) of the preceding section a cycle (of order m); and hence a complete 
n-m-simplex is formed by the vertices c(i) and the links c(i, j). 

If the cycles c(i) are the vertices and the cycles c(i, j) the links of a complete 
n-m-simplex S, then a vector V over S determines (and is itself determined by) 
the cycle c(V) generated by V and the coordinates (i, V) for 
i = 1, ..., n subject to the following rules. 

(a) (i, V) = ∞ if, and only if, c(i)c(V) ≠ 0. 

(b) n(c(V)) ≤ m. 

(c) If c(i)c(V) = 0, then (i, V) is a cycle and c(i), c(V), (i, V) are collinear. 

(d) If c(i)c(V) = c(j)c(V) = 0, i≠j, then (i, V), (j, V), c(i, j) are collinear. 

An immediate consequence of n(c(V)) ≤ n(c(i)) and (c) is the following 
statement. 


(e) (i, V)c(V) = 0; n((i, V)) = m. 


THEOREM III.3.2. If S is an n-m-simplex with vertices c(i) and links 
c(i, j), if 4<n, if c is a cycle and n(c) ≤ m, if c+c(1)+...+c(n) is primary and splits 
completely, if cc(k) = 0, if d is a cycle such that c, d, c(k) are collinear, then there 
exists one and only one vector V over S such that c = c(V) and d = (k, V). 


Proof. We may assume without loss in generality that k = 1. Suppose 
first that V and U are vectors over S such that c(V) = c = c(U) and 
(1, V) = d = (1, U). There exists at most one i such that c(1)(c+c(i)) is 
not 0. If c(1)(c+c(j)) = 0 = c(1)(c+c(k)) for j≠k, then the seven cycles 
c, c(j), c(k) / c(1) / d, c(1, j), c(1, k) are in Desargues order; and it follows 
from (e), (f) of the preceding section that their connecting links are uniquely 
determined and are c(j, k), (j, *), (k, *); and hence it follows from (c) in the 
definition of a vector that (j, *) = (j, U) = (j, V), (k, *) = (k, V) = (k, U). 
If finally cc(i) = 0, but c(1)(c+c(i)) ≠ 0, then it follows from similar reasoning 
that for some j different from 1 and i 


(i, V) = (c + c(i))((j, V) + c(i, j)) 
= (c + c(i))((j, U) + c(i, j)) = (i, U) 
so that U = V. 

In order to prove the existence of a vector V over S meeting all the 
requirements we proceed in a similar fashion. There exists at most one i≠1 
such that c(1)(c+c(i)) ≠ 0. If j≠1 and c(1)(c+c(j)) = 0, then we put (j, V) 
= (c+c(j))(d+c(1, j)). If j≠k and c(1)(c+c(j)) = 0 = c(1)(c+c(k)), then the 
seven cycles d, c(1, j), c(1, k) / c(1) / c, c(j), c(k) are in Desargues order and 
their connecting links are c(j, k), (k, V), (j, V). Hence it follows from (f) of 
the preceding section that (j, V), (k, V) are cycles, meeting the requirements 
(c) and (d) of the definition of a vector, as follows from (e) of the preceding 
section and from Theorem III.2.2. If we put c = c(V), d = (1, V), then the 
definition of the desired vector V has been completed, if either c(1)(c+c(i)) = 0 for 
every i≠1 or c(1)(c+c(i)) ≠ 0 for an i such that c(i)c ≠ 0, as we have to 
put (i, V) = ∞ in this case. 

In order to complete the proof we assume now that cc(n) = 0 and 
c(1)(c+c(n)) ≠ 0. But then c(i)(c+c(n)) = 0 for 1<i<n; we put (n, V)_i 
= (c+c(n))((i, V)+c(i, n)). Then we may prove as before that (n, V)_i 
is a cycle, that c, c(n), (n, V)_i as well as (i, V), c(i, n), (n, V)_i are collinear. 
If furthermore, 1≤j<n, j≠i, then the seven cycles c(i, n), c(i, j), 
(i, V) / c(i) / c(n), c(j), c = c(V) are in Desargues order; their connecting 
links (j, V), (n, V)_i, c(n, j) are collinear by Theorem III.2.2. But this implies 
(n, V)_i ≤ (n, V)_j so that (n, V)_i = (n, V)_j for reasons of symmetry. If we put 
now (n, V) = (n, V)_2 = ... = (n, V)_{n-1}, then this completes again the 
definition of the required vector. 

If S is a complete n-m-simplex, 4<n, if c is a cycle such that n(c) ≤ m 
and the sum of c and of the vertices is primary and splits completely, then 
there exists a vertex c(k) of S such that cc(k) = 0; and it follows from the 
primarity by Lemma I.3.8 that there exists a cycle d such that c, d, c(k) are 
collinear; and hence we have proved the following corollary to Theorem III.3.2. 

There exists a vector V over S which generates the cycle c. 

4. Subtraction of vectors. Throughout this section we shall assume that 
the element g (in the Dedekind set D) is primary and splits completely, that S 
is a complete n-m-simplex whose vertices c(i) and whose links c(i, j) are 
parts of g and that 5<n, though for some of the following results it would 
suffice to assume 4<n. 


Lemma III.4.1. If A and B are vectors over S such that c(A) and c(B) 
are parts of g, and if c(h)(c(A)+c(B))=0=(c(A)+c(B))c(k), then (c(A) 
+c(B))((h, A)+(h, B)) and (c(A)+c(B))((k, A)+(k, B)) are equal cycles 
and c(A), c(B), (c(A)+c(B))((h, A)+(h, B)) as well as (h, A), (h, B), 
(c(A)+c(B))((h, A) +(h, B)) are collinear triplets. 


Proof. We note first the existence of a j such that c(j)(c(A)+c(B)+c(h) 
+c(k)) =0, and that it suffices clearly to prove our statement for the couple 
h, j instead of proving it for the couple h, k. Since c(h)(c(A)+c(B)) =0, it fol- 
lows that c(j)+c(h)+c(A)+c(B) is the direct sum of c(A)+c(B), c(h), c(j). 
Since c(i), c(X), (i, X) are collinear whenever c(i)c(X) = 0, it follows that the 
cycles (h, A), (h, B), c(j, h) / c(h) / c(A), c(B), c(j) are in Desargues order; 
and their connecting links are the (by (e), (f) of §III.2) uniquely determined 
cycles (j, B), (j, A), ((h, A)+(h, B))(c(A)+c(B)) which are collinear by 
Theorem III.2.2. 




On account of this lemma we define: 

If A and B are vectors over S such that c(A) and c(B) are parts of g, then 
(A, B) = (c(A)+c(B))((i, A)+(i, B)) for every i such that c(i)(c(A)+c(B)) 
= 0. 

We note that (A, B) = (B, A) is a cycle such that both c(A), c(B), (A, B) 
and (i, A), (i, B), (A, B) are collinear triplets. This cycle (A, B) shall serve 
later as the cycle generated by the difference of A and B. For the definition 
of the coordinates of the difference vector we need two auxiliary functions. 


Lemma III.4.2. If A and B are vectors over S such that c(A) and c(B) are 
parts of g, then 

(a) (c(A)+c(B))(c(i)+c(j)) = 0 and i≠j imply 

that w(i/j; A, B) = (c(i, j)+(A, B))((i, A)+(j, B)) is a cycle of order m, 

that (i, A), (j, B), w(i/j; A, B) are collinear and their sum is the direct sum 
of any two of them, 

that c(i, j), (A, B), w(i/j; A, B) are collinear and that their sum is both the 
direct sum of c(i, j) and (A, B) and the direct sum of (A, B) and w(i/j; A, B); 

(b) (c(A)+c(B))(c(i)+c(j)) = (c(A)+c(B))(c(j)+c(h)) = (c(A)+c(B))(c(h) 
+c(i)) = 0 for three different integers i, j, h implies the collinearity of w(i/j; A, B), 
w(h/j; A, B) and c(i, h). 

The formula defining w(i/j; A, B) is meaningful if, and only if, c(i)c(A) 
= c(B)c(j) = 0. But we shall consider this function only, if 


(c(A) + c(B))(c(i) + c(j)) = 0. 


Proof. If (c(A)+c(B))(c(i)+c(j)) = 0 and i≠j, then it follows from 
property (d) of the vector definition and from Lemma III.4.1 that the two triplets 
c(i, j), (i, A), (j, A) and (A, B), (j, B), (j, A) are collinear; and hence it 
follows from Lemma III.1.2 that the triplets (i, A), (j, B), w(i/j; A, B) and 
c(i, j), (A, B), w(i/j; A, B) are collinear too. 

Since it follows from our hypothesis and Lemma III.4.1 that 0 = (c(A) 
+c(B))(c(i)+c(j)) = ((A, B)+c(A))(c(i)+c(i, j)), and since c(A)(i, A) = 0 by 
(e) of the preceding section, it follows from Lemma I.1.1 that 

(i, A)w(i/j; A, B) = (i, A)(c(i) + c(A))(c(i, j) + (A, B))w(i/j; A, B) 
= (i, A)(c(i)c(i, j) + c(A)(A, B))w(i/j; A, B) 
= (i, A)c(A)(A, B)w(i/j; A, B) = 0; 
and likewise it follows that 
(i, A)(j, B) = (i, A)(c(i) + c(A))(c(j) + c(B))(j, B) 
= (i, A)(c(i)c(j) + c(A)c(B))(j, B) 
= (i, A)c(A)c(B)(j, B) = 0. 

The sum of the collinear triplet (i, A), (j, B), w(i/j; A, B) is therefore the 
direct sum of the two cycles (i, A) and (j, B) of order m; and (i, A)w(i/j; A, B) 
= 0 implies therefore that w(i/j; A, B) cannot contain more than one 
subcycle of order 1. Thus it follows from Theorem I.3.7 that w(i/j; A, B) is a 
cycle; and the remainder of (a) is readily proved. 

In order to prove (b) we assume first in addition to the special hypotheses 
of (b) that 0 = c(A)(c(i)+c(j)+c(h)). Then it follows from Lemma I.1.1 and 
(c(A)+(A, B))(c(j)+c(i, j)) = 0 that 

(j, A)(c(i, j) + (A, B)) = (j, A)(c(j) + c(A))(c(i, j) + (A, B)) 
= (j, A)(c(j)c(i, j) + c(A)(A, B)) 
= (j, A)c(A)(A, B) = 0 

and likewise that (j, A)(c(h, j)+(A, B)) = 0; and it follows from Dedekind’s 
law and c(j)(c(i, j)+c(j, h)) = 0 that 

(j, A)(c(i, j) + c(j, h)) 
= (j, A)(c(j) + c(A))(c(i) + c(j) + c(h))(c(i, j) + c(j, h)) 
= (j, A)(c(j) + c(A)(c(i) + c(j) + c(h)))(c(i, j) + c(j, h)) 
= (j, A)c(j)(c(i, j) + c(j, h)) = 0. 

Hence we have shown that the seven cycles c(i, j), c(h, j), (A, B) / (j, A) / (i, A), 
(h, A), (j, B) are in Desargues order; and it follows from part (a) of this lemma 
that their uniquely determined connecting links are the cycles w(h/j; A, B), 
w(i/j; A, B), c(i, h) and that their collinearity is a consequence of Theorem 
III.2.2. 

To derive the general case of (b) from the special case already proved we 
note that c(A)+c(B)+c(i)+c(j)+c(h) is a sum of five cycles; and hence it 
follows from 5<n and from Corollary I.3.4 that there exists an integer k, 
different from i, j, h, such that 


c(k)(c(A) + c(B) + c(i) + c(j) + c(h)) = 0. 
Thus it follows from (c(A)+c(B))(c(i)+c(j)) = 0 that 
(c(A) + c(B))(c(i) + c(j) + c(k)) = 0; 
and likewise that 
(c(A) + c(B))(c(j) + c(h) + c(k)) = 0, 
(c(A) + c(B))(c(h) + c(i) + c(k)) = 0; 


and hence it follows from what we have shown in the preceding paragraph 
that w(i/j; A, B), w(k/j; A, B), c(i, k) and w(k/j; A, B), w(h/j; A, B), c(k, h) 
are collinear triplets. 

Using (a) and Lemma I.1.1 it follows that 

(k, A)(w(k/j; A, B) + c(k, h)) 
= (k, A)(c(k) + c(A))((A, B) + c(k, j) + c(k, h))(w(k/j; A, B) + c(k, h)) 
= (k, A)(c(A)(A, B) + c(k)(c(k, j) + c(k, h)))(w(k/j; A, B) + c(k, h)) 
= (k, A)c(A)(A, B)(w(k/j; A, B) + c(k, h)) = 0 
and likewise that 
(k, A)(w(k/j; A, B) + c(k, i)) = 0; 


and since (k, A)(c(k, h)+c(k, i)) = 0 may be shown as before, it follows that the 
seven cycles w(k/j; A, B), c(k, h), c(k, i) / (k, A) / (j, B), (h, A), (i, A) are in 
Desargues order; and hence it follows from Theorem III.2.2 that their 
uniquely determined connecting links c(i, h), w(i/j; A, B), w(h/j; A, B) 
are collinear, completing the proof of (b). 

The following two remarks will simplify the handling of the function 
w(i/j; A, B). It is a consequence of its defining equation and of (A, B) =(B, A) 
that 

w(i/j; A, B) = w(j/i; B, A); 


and hence it follows under the hypotheses of Lemma III.4.2, (b) that 
w(i/j; A, B), w(i/h; A, B), c(j, h) are collinear. 


Lemma III.4.3. If A and B are vectors over S such that c(A) and c(B) are 
parts of g, then 


(a) (c(A)+c(B))(c(i)+c(j)) = 0 and i≠j imply that z(i/j; A, B) 
= (w(i/j; A, B)+c(j))(c(i)+(A, B)) is a cycle of order m, 

that z(i/j; A, B), w(i/j; A, B), c(j) are collinear and their sum is the direct 
sum of any two of them, 

that z(i/j; A, B), (A, B), c(i) are collinear and their sum is both the direct 
sum of z(i/j; A, B) and (A, B) and the direct sum of (A, B) and c(i), 

that z(i/j; A, B), (i, A), c(B) are collinear; 

(b) (c(A)+c(B))(c(i)+c(j)) = (c(A)+c(B))(c(i)+c(h)) = 0 for three different 
integers i, j, h imply z(i/j; A, B) = z(i/h; A, B). 


Proof. Since it follows from Lemma III.4.2 and the definition of a complete 
simplex that the triplets w(i/j; A, B), (A, B), c(i, j) and c(j), c(i), c(i, j) are 
collinear, we may infer from Lemma III.1.2 that the triplets z(i/j; A, B), 
w(i/j; A, B), c(j) and z(i/j; A, B), c(i), (A, B) are collinear. Since c(A)+c(B) 
+c(i)+c(j) is the direct sum of c(A)+c(B), c(i) and c(j), it follows from 
(A, B) ≤ c(A)+c(B) and z(i/j; A, B) ≤ c(i)+(A, B) that z(i/j; A, B)c(j) = 0; 
and similarly we see that w(i/j; A, B)c(j) = 0. From these facts one derives as 
usual all the statements of (a) except the last one. This last statement is a 
consequence of Theorem III.2.1, since z(i/j; A, B), (i, A), c(B) are connecting 
links of the two triplets c(A), (A, B), c(i) and (j, B), c(j), w(i/j; A, B), and 
since c(j)(c(A)+(A, B)+c(i)) = 0, as has been pointed out before. 




In order to prove (b) we assume first that, in addition to the other 
hypotheses, (c(A)+c(B))(c(i)+c(j)+c(h)) = 0. Then c(i, h)((A, B)+c(j, i)) = 0 
= c(i, h)((A, B)+c(i)); and thus the seven cycles c(j, i), c(i), (A, B) / c(i, h) / 
c(j, h), c(h), w(i/h; A, B) are in Desargues order, as follows from Lemma 
III.4.2 and the properties of a complete simplex. Thus their connecting links 
are uniquely determined; they are, by Lemma III.4.2, the cycles z(i/h; A, B), 
w(i/j; A, B), c(j); and the collinearity of these cycles is a consequence of 
Theorem III.2.2. Thus 


z(i/h; A, B) ≤ (w(i/j; A, B) + c(j))(c(i) + (A, B)) = z(i/j; A, B); 


and this inequality implies equality, since every z(...) has been shown to 
be a cycle of order m. 

To derive the general case of (b) from the special case we have proved just 
now, we note first that there exists an integer k such that c(k)(c(i)+c(j) 
+c(h)+c(A)+c(B)) = 0, since 5<n, since c(i)+c(j)+c(h)+c(A)+c(B) is a 
sum of five cycles and therefore by Corollary I.3.4 a direct sum of at most 
five cycles. Hence it follows from our hypothesis that (c(A)+c(B))(c(i)+c(j) 
+c(k)) = 0 = (c(A)+c(B))(c(i)+c(h)+c(k)); and from what has been shown 
already it follows that 


z(i/j; A, B) = z(i/k; A, B) = z(i/h; A, B), 


as was to be shown. 


If A and B are vectors over S such that c(A) and c(B) are parts of g, then 
there exist integers i such that c(i)(c(A)+c(B)) = 0; and to every i satisfying 
this condition there exist integers j≠i such that (c(i)+c(j))(c(A)+c(B)) = 0. 
It is a consequence of Lemma III.4.3, (b) that the cycle z(i/j; A, B) is 
independent of the choice of j; and thus the following definition is well determined. 


DEFINITION. If c(i)(c(A)+c(B)) = 0, then z(i; A, B) = z(i/j; A, B) for every 
j such that (c(i)+c(j))(c(A)+c(B)) = 0. 


Lemma III.4.4. If A and B are vectors over S such that c(A)+c(B) ≤ g; and 
if c(i)(c(A)+c(B)) = c(j)(c(A)+c(B)) = 0 for i≠j, then z(i; A, B), z(j; A, B) 
and c(i, j) are collinear. 


Proof. From 5<n we infer as usual the existence of an integer h such that 
c(h)(c(i)+c(j)+c(A)+c(B)) = 0. From the hypothesis of this lemma it follows 
now that (c(i)+c(h))(c(A)+c(B)) = 0 = (c(j)+c(h))(c(A)+c(B)) so that 
z(i; A, B) = z(i/h; A, B), z(j; A, B) = z(j/h; A, B). The three cycles z(i; A, B), 
z(j; A, B), c(i, j) are therefore connecting links of the two triplets c(j), c(i), 
(A, B) and w(j/h; A, B), w(i/h; A, B), c(h), as follows from Lemma III.4.2, 
(b) and Lemma III.4.3, (a); and the collinearity of these connecting links is a 
consequence of Theorem III.2.1, since c(h)(c(i)+c(j)+(A, B)) = 0. 

If A and B are any two vectors over S such that c(A)+c(B) ≤ g, then there 
exist integers i≠j such that (c(A)+c(B))(c(i)+c(j)) = 0. Thus (A, B) is a 
well determined cycle (contained in c(A)+c(B) so that n((A, B)) ≤ m by 
Theorem I.2.1) by Lemma III.4.1 and the cycles z(i; A, B) and z(j; A, B) 
are well determined by Lemma III.4.3, (b). It follows finally from Lemma 
III.4.4 and Theorem III.3.2 that there exists one and only one vector D over S 
such that c(D) = (A, B) and such that (i, D) = z(i; A, B) for every i such that 
c(i)(c(A)+c(B)) = 0; and this vector D over S which is uniquely determined 
by A and B shall be termed the difference A-B of A and B. 

There exists one and only one vector 0 over S satisfying c(0) = 0 and 
(i, 0) = c(i) for every i; and one verifies readily that 0 is the only vector V 
over S such that c(V) = 0. 


Lemma III.4.5. If A is a vector over S such that c(A) ≤ g, then 
(i) A-A = 0, 
(ii) A = A-0, 

(iii) A = 0-(0-A). 


Proof. There exist integers i≠j such that c(A)(c(i)+c(j)) = 0. Thus it 
follows from Lemma III.4.1 that c(A-A) = (A, A) = c(A)(i, A) = 0 by (e) of §3 
of this part and this implies A-A = 0 by Lemma III.4.3. It follows 
furthermore from Lemma III.4.1 and (c) of §3 that 


c(A - 0) = (A, 0) = (c(A) + 0)((i, A) + c(i)) = c(A)((i, A) + c(i)) = c(A); 
and hence it follows from Lemma III.4.3 that 
(i, A - 0) = z(i; A, 0) = z(i/j; A, 0) = (c(i) + c(A))(c(j) + w(i/j; A, 0)) 
= (c(i) + c(A))(c(j) + (c(i, j) + c(A))((i, A) + c(j))) 
= (c(i) + c(A))(c(j) + c(i, j) + c(A))((i, A) + c(j)) 
= (c(i) + c(A))((i, A) + c(j)) = (i, A) or A-0 = A. 
To prove (iii) we note that c(0 - A) = (0, A) = (A, 0) = c(A - 0) = c(A) and
that therefore w(i/j; 0, A) = (c(i, j)+c(A))(c(i)+(j, A)) ≤ c(i, j)+c(A)
= c(i, j)+w(i/j; 0, A) by Lemma III.4.2, (a) and (i, 0 - A) = z(i/j; 0, A)
= (c(i)+c(A))(c(j)+w(i/j; 0, A)) so that

c(0 - (0 - A)) = (0, 0 - A) = (0 + c(A))(c(i) + (i, 0 - A))
= c(A)(c(i) + (c(i) + c(A))(c(j) + w(i/j; 0, A)))
= c(A)(c(i) + c(j) + w(i/j; 0, A))(c(i) + c(A)) = c(A)


by Dedekind's law and Lemma III.4.1; and consequently it follows from
Lemmas III.4.1, III.4.3 and III.4.4 that

w(i/j; 0, 0 - A) = (c(i, j) + c(A))(c(i) + (j, 0 - A))
= (c(i, j) + c(A))(c(i) + z(j/i; 0, A))
= (c(i, j) + c(A))(c(i) + w(j/i; 0, A))
= (c(i, j) + (A, 0))(c(i) + w(i/j; A, 0)) = w(i/j; A, 0)




and

(i, 0 - (0 - A)) = (c(i) + c(A))(c(j) + w(i/j; 0, 0 - A))
= (c(i) + c(A))(c(j) + w(i/j; A, 0)) = z(i/j; A, 0) = (i, A)

by (ii); and hence 0 - (0 - A) = A.
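
The computations above lean repeatedly on Dedekind's law. A minimal sketch, assuming that the law meant here is the usual modular identity a + bc = (a + b)c for a ≤ c, with sum and cross-cut read as the sum and the intersection of subgroups: the check below runs over all subgroups of the elementary abelian group of order 8. The group and every function name are illustrative choices, not notation of the paper.

```python
from itertools import product

# All 8 elements of the elementary abelian group of order 8 (vectors over F_2).
VECTORS = list(product((0, 1), repeat=3))


def add(v, w):
    """Componentwise sum of two vectors modulo 2."""
    return tuple((a + b) % 2 for a, b in zip(v, w))


def span(generators):
    """Subgroup generated by the given vectors, returned as a frozenset."""
    space = {(0, 0, 0)}
    for g in generators:
        # Adjoining g doubles the subgroup unless g already lies in it.
        space |= {add(v, g) for v in space}
    return frozenset(space)


# Every subgroup is generated by at most three vectors, so this finds all of them.
SUBGROUPS = {span(gens) for gens in product(VECTORS, repeat=3)}


def join(a, b):
    """The sum a + b: the smallest subgroup containing a and b."""
    return span(list(a) + list(b))


def meet(a, b):
    """The cross-cut ab: the intersection of a and b."""
    return a & b


def dedekind_law_holds():
    """Check a + bc = (a + b)c for all subgroups a, b, c with a <= c."""
    return all(
        join(a, meet(b, c)) == meet(join(a, b), c)
        for a in SUBGROUPS
        for b in SUBGROUPS
        for c in SUBGROUPS
        if a <= c
    )


def is_distributive():
    """Check the stronger law (a + b)c = ac + bc for all subgroups."""
    return all(
        meet(join(a, b), c) == join(meet(a, c), meet(b, c))
        for a in SUBGROUPS
        for b in SUBGROUPS
        for c in SUBGROUPS
    )
```

In this example the modular identity holds for every admissible triple while the distributive law fails, which is why the hypothesis a ≤ c has to be verified each time the law is applied.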


LEMMA III.4.6. If A, B, C are vectors over S such that s = c(A)+c(B)+c(C)
is contained in g, then (A - B) - C = (A - C) - B.


The proof of this associative law of subtraction will be effected in a number
of steps.

(1) c(A—B), c(B—C), c(C—A) are collinear. 

There exists an integer i such that sc(i) = 0. Then the seven cycles
(i, A), (i, B), (i, C) / c(i) / c(A), c(B), c(C) are in Desargues order and their
uniquely determined connecting links c(B - C), c(C - A), c(A - B) (by
Lemma III.4.1) are collinear by Theorem III.2.2.

(2) If (c(i)+c(j))s = 0 for i ≠ j, then the two triplets c(B - C), w(i/j; A, C),
w(i/j; A, B) and c(A - B), w(i/j; A, C), w(i/j; B, C) are collinear.

The seven cycles c(A - B), c(A - C), c(i, j) / (j, A) / (j, B), (j, C), (i, A)
are in Desargues order by Lemma III.4.1 and the definition of vectors, since

(j, A)(c(A - B) + c(A - C) + c(i, j)) ≤ (j, A)(c(j) + c(A))(c(i, j) + s)
≤ (j, A)(c(A) + c(j)(s + c(i, j)))
= (j, A)c(A) = 0;

their connecting links are by (1) and Lemma III.4.2 the cycles w(i/j; A, C),
w(i/j; A, B), c(B - C) and their collinearity is a consequence of Theorem
III.2.2. The collinearity of the second triplet is immediately inferred from
the collinearity of the first triplet, if one remembers that w(i/j; X, Y)
= w(j/i; Y, X).

(3) If c(i)s = 0, then the two triplets (i, A - B), (i, A - C), c(B - C) and
(i, A - C), (i, B - C), c(A - B) are collinear.

There exists some j ≠ i such that s(c(i)+c(j)) = 0. Then the cycles of the
first triplet are connecting links of the two triplets c(A - C), c(A - B), c(i)
and w(i/j; A, C), w(i/j; A, B), c(j), as follows from (1), (2), Lemma III.4.3;
and thus the collinearity of the first triplet is a consequence of Theorem
III.2.1, since c(j)(c(i)+c(A - C)+c(A - B)) ≤ c(j)(c(i)+s) = 0. The cycles of
the second triplet are likewise connecting links of the triplets (i, B), (i, A),
c(C) and w(i/j; B, C), w(i/j; A, C), c(j); and the collinearity of the second
triplet follows from Theorem III.2.1, since c(j)(c(C)+(i, A)+(i, B))
≤ c(j)(c(i)+s) = 0.

Since 5 < n, there exist three different integers i, j, h such that (c(i)+c(j)
+c(h))s = 0; and these three integers shall be kept fixed throughout the
remainder of the proof.




(4) w(i/j; B, C), w(j/h; C, A), w(h/i; A, B) are collinear.

One verifies readily that c(A - B)(j, C) = (c(A - B)+(j, C))(i, B)
= (c(A - B)+(j, C)+(i, B))(h, A) = 0. It follows from (2) and Lemma III.4.2
that w(i/j; B, C), w(j/h; C, A), w(h/i; A, B) are connecting links of the two
triplets (h, A), (i, B), (j, C) and c(h, i), c(A - B), w(i/j; A, C); and (4) is
now a consequence of Theorem III.2.1, since the above equations imply
c(A - B)((h, A)+(i, B)+(j, C)) = 0.

We introduce now three auxiliary vectors. If X is any of the vectors
A, B, C under consideration, then X_h is the vector, uniquely determined
by Theorem III.3.2, which satisfies c(X_h) = (h, 0 - X), (i, X_h) = w(h/i; 0, X). It
is an immediate consequence of Lemmas III.4.2 and III.4.3 that (j, X_h)
= w(h/j; 0, X). We are not able to say anything concerning the h-coordinate
of this vector X_h.

We note furthermore that (c(i)+c(j))(c(A_h)+c(B_h)+c(C_h)) = (c(i)+c(j))
((h, 0 - A)+(h, 0 - B)+(h, 0 - C)) ≤ (c(i)+c(j))(c(h)+s) = 0 so that the
statements (2) and (3) may be applied upon A_h, B_h, C_h too.

(5) If X and Y are two of the vectors A, B, C, then X - Y = X_h - Y_h.

It is a consequence of s(c(i)+c(j)+c(h)) = 0 and of (2), (3) that the two
triplets c(X - Y), (h, 0 - X), (h, 0 - Y) and c(X - Y), w(h/i; 0, X), w(h/i; 0, Y)
are collinear triplets. Since furthermore

(h, 0 - X)(w(h/i; 0, X) + w(h/i; 0, Y))
≤ (h, 0 - X)(c(h) + c(X))(c(h, i) + c(X) + c(Y))
≤ (h, 0 - X)c(X) = 0,

it follows from Dedekind's law and the definition of the cycle generated by the
difference of vectors (cf. Lemma III.4.1) that

c(X_h - Y_h) = ((h, 0 - X) + (h, 0 - Y))(w(h/i; 0, Y) + w(h/i; 0, X))
= c(X - Y) + (h, 0 - X)(w(h/i; 0, Y) + w(h/i; 0, X))
= c(X - Y).


It is a consequence of (4) that the cycles w(i/j; X, Y), w(j/h; Y, 0),
w(h/i; 0, X) are collinear; and it is a consequence of Lemma III.4.2 that
the cycles c(i, j), c(X - Y), w(i/j; X, Y) are collinear. Thus it follows that
w(i/j; X, Y) ≤ (c(i, j)+c(X - Y))(w(h/i; 0, X)+w(h/j; 0, Y)) = (c(i, j)
+c(X_h - Y_h))((i, X_h)+(j, Y_h)) = w(i/j; X_h, Y_h); and this inequality implies
equality, since the cycles w(i/j; ...) are of order m. From the equalities thus
obtained, it follows that z(i/j; X, Y) = z(i/j; X_h, Y_h) (cf. Lemma III.4.3); and
hence (i, X - Y) = (i, X_h - Y_h) and X - Y = X_h - Y_h is now a consequence of
Theorem III.3.2.

(6) (i, A - C), (j, A - B), w(i/j; B, C) are collinear.

It follows from (2) that w(i/j; B_h, 0), c(C_h) = c(C_h - 0) (by Lemma III.4.5,
(ii)) and w(i/j; B_h, C_h) are collinear; and from (3) that (i, A_h - B_h), c(B_h - C_h),
(i, A_h - C_h) are collinear. The three cycles (i, A_h), (j, A_h - B_h), w(i/j; B_h, 0) are
collinear, since they are (by Lemmas III.4.1 to III.4.3) connecting links of
the two triplets c(B_h), c(i, j), (j, A_h) and c(j), (i, B_h), c(A_h - B_h), and since
Theorem III.2.1 may be applied as a consequence of

c(B_h)(c(j) + (i, B_h) + c(A_h - B_h))
= (h, 0 - B)(c(j) + w(h/i; 0, B) + c(A - B))
≤ (h, 0 - B)(c(h) + c(B))(c(j) + c(h, i) + c(A) + c(B))
= (h, 0 - B)c(B) = 0.

Since c(B_h) = (h, 0 - B) is a cycle of order m, and since c(B_h)(c(i, j)
+ c(B_h - C_h) + (i, A_h - B_h)) = (h, 0 - B)(c(i, j) + c(B - C) + (i, A - B))
≤ (h, 0 - B)(c(i)+c(j)+c(A)+c(B)+c(C)) = 0, it follows that the seven cycles
w(i/j; B_h, 0), c(C_h), (i, A_h) / c(B_h) / c(i, j), c(B_h - C_h), (i, A_h - B_h) are in
Desargues order, that (i, A_h - C_h), (j, A_h - B_h), w(i/j; B_h, C_h) are their
uniquely determined connecting links; and (6) is now a consequence of
Theorem III.2.2 and of (5).

(7) c((A - B) - C) = c((A - C) - B).

The three cycles c((A - B) - C), (i, A - C), (i, B) are collinear, since they
are (by (6)) connecting links of the two triplets (i, A), c(A - B), c(C) and
w(i/j; B, C), (j, C), (j, A - B), and since Theorem III.2.1 may be applied as a
consequence of (j, C)((i, A)+c(A - B)+c(C)) ≤ (j, C)(c(i)+s) = 0.


c(i)s = 0 implies that the seven cycles (i, A), (i, A - B), (i, C) / c(i) / c(A),
c(A - B), c(C) are in Desargues order; and it follows from Theorem III.2.2
that their connecting links c((A - B) - C), c(A - C), c(B) are collinear.
Consequently we find that

c((A - B) - C) ≤ (c(A - C) + c(B))((i, A - C) + (i, B)) = c((A - C) - B)

and the symmetry of our hypotheses on B and C implies the opposite inequality,
proving the desired equation (7).

(8) c(h) = c(A - A_h).

Since c(h) ≤ ((h, 0 - A)+c(A))(w(h/i; 0, A)+(i, A)) = c(A_h - A) as a
consequence of Lemmas III.4.1 to III.4.4, the equation (8) may be inferred from
the fact that n(c(A - A_h)) ≤ m = n(c(h)).

(9) c(h), c(A_h - X), c(A - X) and c(h), (k, A_h - X), (k, A - X) are collinear
triplets for X = B, C and k = i, j.

This is an immediate consequence of (8), (4) and (1).

(10) c(h), c((A_h - B) - C), c((A - B) - C) are collinear.

For they are connecting links of the seven cycles (i, C), (i, A_h - B),
(i, A - B) / c(i) / c(C), c(A_h - B), c(A - B) which are in Desargues order,
since c(i)(c(A - B)+c(A_h - B)+c(C)) ≤ c(i)(c(h)+s) = 0, and since therefore
Theorem III.2.2 may be applied.

(11) c(h), w(i/j; A_h - B, C), w(i/j; A - B, C) are collinear.




Since (i, C) is a cycle of order m, since (i, C)(c(i, j)+c((A - B) - C)
+c((A_h - B) - C)) ≤ (i, C)(c(i, j)+c(h)+s) = 0, it follows that the seven cycles
c(i, j), c((A - B) - C), c((A_h - B) - C) / (i, C) / (j, C), (i, A - B), (i, A_h - B)
are in Desargues order so that their connecting links (by (10) and (9))
c(h), w(i/j; A_h - B, C), w(i/j; A - B, C) are collinear as a consequence of
Theorem III.2.2.

(12) (j, B), (i, A - C), w(i/j; A - B, C) are collinear.

The seven cycles c(B - C), (j, C), (i, A_h - B) / (i, C) / (i, B), c(i, j),
c((A_h - B) - C) are in Desargues order, since (i, C) is a cycle of order m, and
since (4, C)(c(B — C) + G, C) + An — B)) C)(s + + G, An)) 
0, A)) = (4, =0. Since the cycles 
w(i/j; A_h - B, C), (i, A_h - C), (j, B) are by (3) and (7) their connecting links,
it follows from Theorem III.2.2 that they are collinear. Using this fact and
(9), (11) we find that

c(h) + w(i/j; A - B, C) + (j, B) = c(h) + w(i/j; A_h - B, C) + (j, B)
= c(h) + (i, A - C) + w(i/j; A - B, C)
= c(h) + (i, A_h - C) + w(i/j; A_h - B, C).

Since c(h)((j, B)+(i, A - C)+w(i/j; A - B, C)) ≤ c(h)(s+c(j)+c(i)) = 0, these
identities imply that

(j, B) + (i, A - C) + w(i/j; A - B, C)
= (j, B) + (i, A - C)

as was to be shown.

It follows now from (12) and (7) and Lemmas III.4.1 to III.4.4 that
w(i/j; A - B, C) ≤ ((j, B)+(i, A - C))(c(i, j)+c((A - C) - B)) = w(i/j; A - C, B);
thus w(i/j; A - B, C) = w(i/j; A - C, B), since they are both cycles of order m.
But now it follows from (7) that z(i/j; A - B, C) = z(i/j; A - C, B) and that
consequently z(i; A - B, C) = z(i; A - C, B) or (i, (A - B) - C) = (i, (A - C) - B);
and (A - B) - C = (A - C) - B is now a consequence of (7) and Theorem
III.3.2.


LEMMA III.4.7. If t ≤ g, then the set (S; t) of all the vectors V over S which
satisfy c(V) ≤ t is an abelian group (under the definition of subtraction introduced
in connection with Lemma III.4.4).


Proof. If the vectors A and B over S belong to (S; t), then it follows from
Lemmas III.4.4 and III.4.1 that A - B is a uniquely determined vector over S
which satisfies: c(A - B) ≤ c(A)+c(B) ≤ t; and thus (S; t) contains with any
two vectors over S their uniquely determined difference. It is a consequence
of Lemmas III.4.5 and III.4.6 that this subtraction in (S; t) satisfies furthermore
the following rules.

(A) There exists one and only one element 0 in (S; t) satisfying 0 = A - A,
A = A - 0 = 0 - (0 - A) for every A in (S; t).

(B) If A, B, C are elements in (S; t), then (A - B) - C = (A - C) - B.

To prove that (S; t) is an abelian group, we define addition(*"):


A + B = 0 - ((0 - A) - B)


for A and B in (S; t).

It is obvious that the sum of two elements in (S; t) is a uniquely determined
element in (S; t); and the commutativity of addition is a consequence of (B).
Next one infers from these rules that


A + (B - A) = 0 - ((0 - A) - (B - A)) = 0 - ((0 - (B - A)) - A)
= 0 - (((B - B) - (B - A)) - A)
= 0 - (((B - (B - A)) - B) - A)
= 0 - (((B - (B - A)) - A) - B)
= 0 - (((B - A) - (B - A)) - B) = 0 - (0 - B) = B,


so that B - A is a solution of the equation A + X = B. Finally one verifies the
associative law of addition as follows

A + (B + C)
= 0 - ((0 - A) - (B + C)) = 0 - ((0 - A) - (0 - ((0 - B) - C)))
= 0 - ((0 - (0 - ((0 - B) - C))) - A) = 0 - (((0 - B) - C) - A)
= 0 - (((0 - B) - A) - C) = 0 - (((0 - A) - B) - C)
= 0 - ((0 - (0 - ((0 - A) - B))) - C) = 0 - ((0 - (A + B)) - C)
= (A + B) + C;


and this completes the proof of Lemma III.4.7. 
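
The passage from subtraction to addition can be tried out on a small concrete group. The following is a minimal sketch, assuming the hypothetical example group Z_4 x Z_6 with componentwise subtraction as the only primitive operation; the group, the element encoding and all function names are illustrative and do not occur in the paper. It checks rules (A) and (B) and then the three properties derived in the proof.

```python
from itertools import product

# Hypothetical example group Z_4 x Z_6, chosen only for illustration.
MODULI = (4, 6)
ELEMENTS = list(product(range(MODULI[0]), range(MODULI[1])))
ZERO = (0, 0)


def sub(a, b):
    """Componentwise subtraction modulo MODULI: the only primitive operation."""
    return tuple((x - y) % m for x, y, m in zip(a, b, MODULI))


def add(a, b):
    """Addition derived from subtraction: A + B = 0 - ((0 - A) - B)."""
    return sub(ZERO, sub(sub(ZERO, a), b))


def rule_a_holds():
    """Rule (A): 0 = A - A and A = A - 0 = 0 - (0 - A)."""
    return all(
        sub(a, a) == ZERO and sub(a, ZERO) == a and sub(ZERO, sub(ZERO, a)) == a
        for a in ELEMENTS
    )


def rule_b_holds():
    """Rule (B): (A - B) - C = (A - C) - B."""
    return all(
        sub(sub(a, b), c) == sub(sub(a, c), b)
        for a in ELEMENTS
        for b in ELEMENTS
        for c in ELEMENTS
    )


def derived_addition_is_abelian_group_law():
    """Commutativity, solvability of A + X = B by X = B - A, associativity."""
    commutative = all(add(a, b) == add(b, a) for a in ELEMENTS for b in ELEMENTS)
    solvable = all(add(a, sub(b, a)) == b for a in ELEMENTS for b in ELEMENTS)
    associative = all(
        add(a, add(b, c)) == add(add(a, b), c)
        for a in ELEMENTS
        for b in ELEMENTS
        for c in ELEMENTS
    )
    return commutative and solvable and associative
```

For instance, `add((3, 5), (2, 4))` evaluates to `(1, 3)`, the ordinary componentwise sum, although only `sub` was used to compute it.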


LEMMA III.4.8. If t < u ≤ g, and if the maximum order of the subcycles of u
does not exceed m, then there exists a vector V over S such that c(V) ≤ u, though
c(V) is not part of t.


Proof. Since the parts of g are sums of cycles, there exists a cycle z such
that z ≤ u, though z is not part of t. From our hypothesis it follows that


(*") See in this context the following treatments of the postulates of subtraction in abelian
groups: M. Ward, these Transactions, vol. 32 (1930), pp. 520-526; D. G. Rabinow, American
Journal of Mathematics, vol. 59 (1937), pp. 211-224, 385-392; B. A. Bernstein, these
Transactions, vol. 43 (1938), pp. 1-6.


n(z) ≤ m. But we proved as a corollary to Theorem III.3.2 the existence of a
vector V over S such that c(V) = z; and this is just the statement to be proved.


LEMMA III.4.9. If z_1, ..., z_k are subcycles of g, and if V is a vector over S
such that c(V) ≤ z_1 + ... + z_k, then there exist vectors V_i over S such that
c(V_i) ≤ z_i and V = V_1 + ... + V_k (using addition as in Lemma III.4.7).


For the proof of this theorem we shall need the following lemma.


LEMMA III.4.10. If u and v are cycles such that uv = 0 and such that u+v is
primary and splits completely, and if the elements u, v, t are collinear, then there
exists a subcycle d of t such that u, v, d are collinear.


Proof of Lemma III.4.10. If t = u+v, then follows from Lemma I.3.8
the existence of a cycle d ≤ t which is not a subcycle of any proper partial sum
of u+v and u, v, d are clearly collinear.

Thus we assume now that t < u+v; and we may assume without loss in
generality that 0 < n(u) ≤ n(v) = k. Since (u+v)/u is a cycle of order k, there
exists between u and u+v an element r such that (u+v)/r is a cycle of order 1.
Since t+u = u+v, it follows that t ≤ r does not hold; and since t is a sum of
cycles, there exists a subcycle d of t which is not part of r. Thus d+u ≤ r does
not hold and this implies d+u = u+v. Since the orders of the subcycles of
u+v do not exceed k (by Theorem I.2.1) and since (u+v)/u is a cycle of
order k, it follows that n(d) = k and du = 0. Since t splits as a part of u+v, and
since d is a subcycle of maximum order of t (and u+v), t is the direct sum of
d and of a cycle e of order j; and t < u+v = u+d implies j < n(u). The cross-cut
dv is a cycle of order i; and d+v is (as before) the direct sum of v and of a
cycle z of order k-i. Since u+v = e+d+v = e+z+v, and since n(e) < n(u),
it follows from Corollary I.3.4 that k-i = n(z) ≮ n(u) or n(u) ≤ n(z) and hence
d+v = z+v = u+v is a consequence of Corollary I.3.4 so that u, v, d are
collinear.

Proof of Lemma III.4.9. Since Lemma III.4.9 is certainly true for k = 1, we
may assume that it holds true for vectors W such that c(W) ≤ z_1 + ... + z_{k-1} = s.
Since c(V) is a cycle, there exists a uniquely determined subcycle z of z_k such
that s+c(V) = s+z (note c(V) ≤ s+z_k). It follows from Lemma III.1.1 that
c(V), z and s(c(V)+z) = t are collinear. From 5 < n we infer the existence of an
integer i such that c(i)(s+c(V)) = 0. The ith coordinate (i, V) of the vector V
is therefore a well determined cycle. If r = (c(i)+z)(t+(i, V)), then it follows
from Lemma III.1.2 that the triplets r, c(i), z and r, t, (i, V) are collinear,
since the triplets c(i), (i, V), c(V) and z, t, c(V) are collinear. Since c(i)z = 0,
it follows from Lemma III.4.10 that there exists a subcycle d of r such that
d, c(i), z are collinear. By Theorem III.3.2 there exists one and only one vector
B such that c(B) = z, (i, B) = d.

We note c(B) ≤ z_k. Since furthermore

c(V - B) = (c(V) + c(B))((i, V) + (i, B)) = (c(V) + z)((i, V) + d)
≤ (c(V) + z)((i, V) + r) = (c(V) + z)(t + (i, V))
= t + (i, V)(c(V) + z) = t ≤ s,

V - B is a vector such that c(V - B) ≤ z_1 + ... + z_{k-1}; and hence there exist by
our induction hypothesis vectors V_i such that c(V_i) ≤ z_i for i = 1, ..., k-1
and such that V - B = V_1 + ... + V_{k-1} or

V = V_1 + ... + V_{k-1} + B

as was to be shown.

5. The existence of primary abelian operator groups. The term primary 
abelian operator group shall signify an abelian group G and a ring E of endo- 
morphisms of G satisfying the following conditions. 

(a) Every right-ideal and every left-ideal in E is two-sided. 

(b) There exists an ideal P different from E in E such that every ideal 
different from 0 and E in E is a power of P. 

(c) E contains every D(G; E)-admissible endomorphism of G (where 
D(G; E) denotes the Dedekind set of all the E-admissible subgroups of G). 

(d) To every element x in G there exists a positive integer i such that
xP^i = 0.

Suppose now that the element g of some partially ordered set has the
property:

(D, 6) If g contains one subcycle of order n, then g contains at least six
independent subcycles of order n.

It is an immediate consequence of Theorem II.3.1 that the parts of such 
an element g form the system of admissible subgroups of essentially at most 
one primary abelian operator group. 


THEOREM III.5.1. If the element g in a partially ordered set satisfies (D, 6), 
then the following conditions are necessary and sufficient for the parts of g to be 
the admissible subgroups of some primary abelian operator group. 

(A) The parts of g form a Dedekind set. 

(B) Sums of a finite number of subcycles of g split completely and are primary.

(C) If M is a nonvacuous set of subcycles of g and contains every cycle z 
which is contained in the sum of a finite number of cycles in M, then there exists 
one and only one part s(M) of g such that M is exactly the set of subcycles of s(M). 


Proof. The necessity of (A) is a well known fact in the theory of abelian 
operator groups, the necessity of (B) is a consequence of Theorems II.2.1, 
II.2.4 and I.3.7, and the necessity of (C) is a consequence of the fact that by 
Theorem II.2.1 every element in a primary abelian operator group generates 
a cycle. 

To prove the sufficiency of (A), (B), (C) we show first that: 


(+) The validity of (A), (B), (C) (and (D, 6)) implies the existence of a primary
ring D_0 of subgroups of an abelian group G and of a projectivity of the set
of parts of g upon D_0.

Case 1. The orders of the subcycles of g are bounded. 

Then we denote by m the maximum order of the subcycles of g. By (D, 6)
there exist 6 independent subcycles of order m of g; and hence it follows from
(A), (B) and Lemma III.3.1 that there exists a complete 6-m-simplex S with
vertices and links parts of g. If x ≤ g, then we denote by (S; x) the set of all
those vectors V over S which satisfy c(V) ≤ x. It is a consequence of Lemma
III.4.7 that (S; x) is an abelian group (with regard to the subtraction and addition
introduced in (III.4.4)) whenever x is the sum of a finite number of cycles.
From this fact one infers immediately that every (S; x) for x ≤ g is an abelian
group and moreover a subgroup of the abelian group G = (S; g). If x and y
are different parts of g, then it follows from (C) that one of them, say x,
contains a cycle which the other one does not contain; and thus follows
from Lemma III.4.8 the existence of a vector V such that c(V) ≤ x though
c(V) ≤ y does not hold, that is, (S; x) ≠ (S; y) is a consequence of x ≠ y. But this
fact puts into evidence that mapping the part x of g upon the subgroup (S; x)
of G constitutes a projectivity of the Dedekind set of the parts of g upon
the system D_0 of subgroups (S; x) of the abelian group G. If x ≤ g, then
denote by M(x) the set of all the subcycles of x. It follows from (C) that
x = s(M(x)); and if M is a set of subcycles of g, meeting the requirements
of (C), then M = M(s(M)). If J is any set of parts of g, then let H be the
cross-cut of all the sets M(x) for x in J. Clearly s(H) is the greatest part
of g which is contained in all the x in J; and (S; s(H)) is the cross-cut of
all the (S; x) for x in J. Denote furthermore by K the set of all the cycles
which are contained in the sum of a finite number of cycles from the sets
M(x) for x in J. This set K meets the requirements of (C) and thus s(K)
is a well determined part of g. Clearly (S; s(K)) contains all the subgroups
(S; x) for x in J; and it follows from the construction of K and from Lemma
III.4.9 that (S; s(K)) is exactly the subgroup of G which is generated by
all the subgroups (S; x) for x in J. Thus D_0 has been shown to be a ring of
subgroups of G. If V is any vector over S, then the smallest subgroup of G
in D_0 containing V is just (S; c(V)); and this is a cycle in the set D_0, since the
map of x upon (S; x) has been shown to be a projectivity. Thus D_0 is a primary
ring of subgroups; and this completes the proof of (+) in Case 1.

Case 2. The orders of the subcycles of g are not bounded. 

Then denote by M(i) the set of all the subcycles of g whose order does not
exceed i. It is a consequence of Theorem I.2.1 that M(i) meets the requirements
of (C); and thus there exists a well determined part g(i) = s(M(i)) of g
which contains every subcycle of order not exceeding i and none of higher
order. Since the orders of the subcycles of g are not bounded, it follows from
(D, 6) that g(i) contains at least six independent subcycles of order i (the
maximum order in g(i)). Since g(i) ≤ g satisfies (with g) the conditions (A),
(B), (C) it follows from Case 1 that there exists an abelian group Q(i), a
primary ring D(i) of subgroups of Q(i) and a projectivity q(i) of the Dedekind
set of the parts of g(i) upon the system D(i) of subgroups. Then q(i)^{-1}q(i+1)
is a projectivity of D(i) upon the subgroups of g(i)^{q(i+1)} in D(i+1); and it is
a consequence of Theorem II.1.3 that this projectivity is induced by an isomorphism
of Q(i) upon the subgroup g(i)^{q(i+1)} of Q(i+1) (in D(i+1)). Using these
facts one immediately constructs an abelian group G, subgroups G_i of G, a
primary ring T_i of subgroups of G_i and a projectivity p_i of the Dedekind set
of the parts of g(i) upon the ring T_i of subgroups, meeting the following
requirements.

(i) p_i and p_{i+1} coincide on the parts of g(i).

(ii) Every element in G is contained in some G_i.

If S_i is a subgroup of T_i, and if S_i ≤ S_{i+1}, then there exists one and only
one subgroup S of G which contains all the elements in the S_i and no further
elements; and the set D_0 of all these subgroups of G is clearly a primary ring,
the smallest primary ring containing all the T_i. If x is any part of g, then it
follows from (C) that x is completely determined by the products xg(i); and
if x(i) ≤ g(i), x(i) ≤ x(i+1), then the existence of a smallest part x of g,
containing all the x(i), is readily inferred from (C). Thus it follows easily that
there exists one and only one projectivity p of the set of parts of g upon D_0
which coincides with p_i on the parts of g(i); and this completes the proof of
(+) in Case 2.

If G is an abelian group and D_0 a primary ring of subgroups of G such
that there exists a projectivity of the parts of g upon D_0, then it follows from
(D, 6) and Theorem II.1.2 that D_0 is the ring D(G; E) of all the E-admissible
subgroups of G where E is the ring of all the D_0-admissible endomorphisms of
G; and it is a consequence of Theorems II.2.2 and II.2.3 together with the
statements (ii), (v) in §II.2, that the right-ideals in E are two-sided; and that
every two-sided ideal different from 0 in E is a power of the uniquely determined
prime ideal P in E. It is a consequence of (B), (C), Theorems I.3.7
and II.2.4 that every left-ideal in E is two-sided; and thus we have shown
that D_0 is the set of all the admissible subgroups of the primary abelian operator
group G over E; and this completes the proof.

REMARK. The condition (D, 6) entering into our formulation of Theorem
III.5.1 is patently not necessary for the existence of the primary abelian
operator group G over E. If we substitute for (D, 6) the condition

(D) g is contained in an element which satisfies (A), (B), (C), (D, 6),
then it is readily verified that (D) is necessary and sufficient for the existence
of the primary abelian operator group G over E.

It is finally an obvious consequence of Theorem II.6.3 that one has to
add the conditions (i), (ii) of Theorem II.6.3 to these conditions (A), (B),
(C), (D, 6) (or (D)), in order to assure that the Dedekind set of the parts of g
is essentially the same as the set of all the subgroups of a suitable primary
abelian group.


UNIVERSITY OF ILLINOIS,
URBANA, ILL.


SUFFICIENT CONDITIONS FOR A WEAK RELATIVE 
MINIMUM IN THE PROBLEM OF BOLZA 


BY 
E. J. MCSHANE 


1. Introduction. The past decade has brought forth great advances in the 
theory of the Bolza problem in the calculus of variations and in the theory of 
the problems (Lagrange, Mayer, and so on) subsumed under it. Ten years 
ago the necessary conditions of Weierstrass, Clebsch and Jacobi (or Mayer) 
were established only for minimizing curves normal on every subarc, while 
the sufficiency theorems needed even more drastic normality assumptions. 
Now the sufficiency theorems are established under the assumption that Lagrange
multipliers λ^0 ≥ 0, λ^1(x), ..., λ^m(x) exist with which the curve E_12
satisfies the Euler equation, the transversality condition, and the strengthened
Weierstrass, Clebsch and Jacobi conditions. The Euler equation, transversality
condition, Weierstrass condition and Clebsch condition are proved
necessary with no normality assumptions. Yet normality requirements have
not been entirely dispensed with. I have shown(1) that for minimizing curves
with order of abnormality 0 or 1 there are multipliers with which all the 
standard necessary conditions are satisfied. But an example shows that mini- 
mizing curves exist, having order of abnormality 2, which do not satisfy all 
the standard necessary conditions with any multipliers. Thus if the gap is to 
be closed and the necessary and sufficient conditions brought together for 
abnormal problems, our only hope is to strengthen the very strong sufficiency 
theorems of Hestenes, Morse, and Reid. 

In a paper to be published in the American Mathematical Monthly, I have
considered the problem of minimizing a function f^0(x) = f^0(x^1, ..., x^n) subject
to conditions


(1.1) f^β(x) = 0 (β = 1, ..., m).


Subject to fairly obvious conditions of definition and differentiability, I have
shown that the following condition is necessary in order that f^0 have a minimum
subject to (1.1) at a point x_0 satisfying (1.1).


(N) To each set (u^1, ..., u^n) satisfying the conditions(2)
(1.2) f^β_{x^i}(x_0) u^i = 0 (β = 1, ..., m)


Presented to the Society, May 2, 1941; received by the editors April 4, 1941, and, in revised 
form, October 11, 1941. 

(1) On the second variation in certain abnormal problems of the calculus of variations, American 
Journal of Mathematics, vol. 63 (1941), pp. 516-530. 

(2) We use the summation convention, summing on all repeated indices.




there correspond multipliers l_0 ≥ 0, l_1, ..., l_m not all zero such that
(1.3) l_α f^α_{x^i}(x_0) = 0

and

(1.4) l_α f^α_{x^i x^j}(x_0) u^i u^j ≥ 0.


Correspondingly, we show that the following condition is sufficient for
f^0(x) to have a proper minimum subject to (1.1) at a point x_0 satisfying (1.1).


(S) To each set of numbers (u^1, ..., u^n) not all zero satisfying (1.2) there
corresponds a set of multipliers l_0 ≥ 0, l_1, ..., l_m such that (1.3) holds and the left
member of (1.4) is positive.


Here we have no normality assumptions whatever, and still the gap be- 
tween conditions (N) and (S) is no greater than that between the necessary 
condition and the sufficient conditions for a minimum of a function f(x) of a 
single real variable, without side conditions. The distinctive feature of conditions
(N) and (S) is the dependence of the multipliers l_0, ..., l_m on the solutions
u^1, ..., u^n of equations (1.2).
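
A small numerical sketch may make this dependence concrete. It rests on assumptions not taken from the paper: the functions f^0 = -(x^2 + y^2), f^1 = x^2 - y^2, f^2 = xy are a hypothetical example, and the left member of (1.4) is read as the quadratic form l_0 q0(u) + l_1 q1(u) + l_2 q2(u) built from the second-order parts of these functions at the origin. All first derivatives vanish at the origin, so the first-order conditions impose no restriction there, and the origin is the only point satisfying both side conditions.

```python
import math
import random


def q0(u):
    """Second-order part of f^0(x, y) = -(x^2 + y^2) at the origin."""
    return -(u[0] ** 2 + u[1] ** 2)


def q1(u):
    """Second-order part of the side condition f^1(x, y) = x^2 - y^2."""
    return u[0] ** 2 - u[1] ** 2


def q2(u):
    """Second-order part of the side condition f^2(x, y) = x * y."""
    return u[0] * u[1]


def multipliers_for(u):
    """Multipliers (l_0, l_1, l_2) chosen for the direction u, with l_0 = 0."""
    return (0.0, q1(u), q2(u))


def weighted_form(l, u):
    """Value of l_0 q0(u) + l_1 q1(u) + l_2 q2(u)."""
    return l[0] * q0(u) + l[1] * q1(u) + l[2] * q2(u)


def smallest_eigenvalue(l):
    """Smallest eigenvalue of the symmetric matrix of l_0 q0 + l_1 q1 + l_2 q2."""
    a = -l[0] + l[1]  # coefficient of x^2
    c = -l[0] - l[1]  # coefficient of y^2
    b = l[2] / 2.0    # half the coefficient of x*y
    return (a + c) / 2.0 - math.hypot((a - c) / 2.0, b)


def directions(count=360):
    """Unit vectors u spread evenly around the circle."""
    return [
        (math.cos(2 * math.pi * k / count), math.sin(2 * math.pi * k / count))
        for k in range(count)
    ]


def positive_with_direction_dependent_multipliers():
    """For each sampled u, the multipliers chosen for u make the form positive."""
    return all(weighted_form(multipliers_for(u), u) > 0 for u in directions())


def some_fixed_multiplier_works(trials=10000, seed=0):
    """Random search for one set with l_0 >= 0 making the form positive definite."""
    rng = random.Random(seed)
    for _ in range(trials):
        l = (rng.uniform(0, 5), rng.uniform(-5, 5), rng.uniform(-5, 5))
        if smallest_eigenvalue(l) > 0:
            return True
    return False
```

In this example the form can be made positive for each direction separately, yet no single set of multipliers with l_0 ≥ 0 makes it positive for all directions at once, since the trace of the combined matrix is -2 l_0 ≤ 0.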

Let us now consider the Bolza problem of minimizing a functional 


in the class of functions 

(1.6) y^i = y^i(x) (i = 1, ..., n; x_1 ≤ x ≤ x_2)
satisfying certain differential equations

(1.7) φ^α(x, y, y') = 0 (α = 1, ..., m < n; x_1 ≤ x ≤ x_2)
and certain end conditions

(1.8) ψ^μ(x_1, y(x_1), x_2, y(x_2)) = 0 (μ = 1, ..., p ≤ 2n + 2).

Under the usual hypotheses on the functions, we are led by the theorems of 
the preceding paragraph to the following conjectures. 


CONJECTURE (N). If a curve
(1.9) E_12: y^i = y^i(x) (x_1 ≤ x ≤ x_2)


minimizes the functional (1.5) in the class of curves satisfying (1.7) and (1.8)
then for each set of functions η^i(x) and numbers ξ_1, ξ_2 which satisfy the equations
of variation of (1.7) and (1.8), there are multipliers λ^0 ≥ 0, λ^α(x) not all zero such
that for the function


(1.10) F(x, y, y', λ) = λ^0 f(x, y, y') + λ^α(x) φ^α(x, y, y')




the Euler equations, transversality condition, Weierstrass condition and Clebsch
condition are satisfied, and the second variation formed in the usual way from F
and η and the end functions is non-negative.


CONJECTURE (S). In order that a smooth curve (1.9) shall give a strong proper
relative minimum to the functional (1.5) in the class of curves satisfying (1.7) and
(1.8), it is sufficient that the following condition be satisfied. To each nonidentically
zero set [η^i(x), ξ_1, ξ_2] satisfying the equations of variation of (1.7) and (1.8)
there shall correspond multipliers λ^0 ≥ 0, λ^α(x) with which the Euler equation,
transversality condition and strengthened Weierstrass and Clebsch conditions
hold, and the second variation is positive.


These conjectures are now being investigated, the first by Miss Mary Jane
Cox and the second by Mr. Franklin G. Myers(3). The purpose of the present
paper is to establish the analogue of Conjecture (S) for weak relative minima.
The proof is made by expansion methods, not as a matter of choice but rather
as a matter of necessity. The field theory hardly seems applicable. We cannot
even find a conjugate set of accessory extremals; worse, we cannot even set
up an accessory problem, because of the dependence of the multipliers on the
η^i(x).

We shall obtain a sufficiency theorem for the parametric form of the problem,
and from this we shall deduce a theorem for the non-parametric problem.

2. Statement of the problem. We shall study the Bolza problem in parametric
form. On an open set R_1 of points (y, r) = (y^0, ..., y^n, r^0, ..., r^n) in
(2n+2)-dimensional space we are given functions


f(y, r), φ^β(y, r) (β = 1, ..., m < n)


of class C^2. We assume that if (y, r) is in R_1 so is (y, kr) for all k > 0, and that f
and the φ^β are positively homogeneous of degree 1 in r. Also, we are given
functions


θ(a), T^{is}(a) (i = 0, 1, ..., n; s = 1, 2)


defined and of class C^2 on an open set R_2 in an r-dimensional space of points
(a^1, ..., a^r).
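
The role of the homogeneity assumption can be illustrated numerically: for an integrand that is positively homogeneous of degree 1 in r, the integral of f(y, y') dt along a curve does not depend on the representation chosen for the curve. The following is a minimal sketch under stated assumptions; the integrand, the arc (a quarter of the unit circle) and the two representations are hypothetical examples, not taken from the paper.

```python
import math


def f(y, r):
    """Hypothetical integrand, positively homogeneous of degree 1 in r."""
    return math.sqrt(1.0 + y[0] ** 2) * math.hypot(r[0], r[1])


def integral(curve, velocity, t1, t2, steps=20000):
    """Midpoint-rule approximation of the integral of f(y(t), y'(t)) dt."""
    h = (t2 - t1) / steps
    total = 0.0
    for k in range(steps):
        t = t1 + (k + 0.5) * h
        total += f(curve(t), velocity(t))
    return total * h


# First representation of a quarter of the unit circle: t in [0, pi/2].
def curve_a(t):
    return (math.cos(t), math.sin(t))


def velocity_a(t):
    return (-math.sin(t), math.cos(t))


# Second representation of the same arc: t = s + s^2 with s in [0, S_END].
S_END = (-1.0 + math.sqrt(1.0 + 2.0 * math.pi)) / 2.0


def curve_b(s):
    return curve_a(s + s * s)


def velocity_b(s):
    # Chain rule: derivative of curve_a(s + s^2) with respect to s.
    vx, vy = velocity_a(s + s * s)
    return ((1.0 + 2.0 * s) * vx, (1.0 + 2.0 * s) * vy)


value_a = integral(curve_a, velocity_a, 0.0, math.pi / 2.0)
value_b = integral(curve_b, velocity_b, 0.0, S_END)
```

The two values agree up to the quadrature error, because the factor 1 + 2s picked up by the velocity is cancelled by the change of variable in the integral.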
If C is a rectifiable curve, having a representation 


with absolutely continuous functions y^i(t), and a is a point in R_2, we say that
the set (C, a) is admissible, or that C is admissible with parameters a, if for
almost all t the point (y(t), y'(t)) lies in R_1 and satisfies the equations


(3) Added in proof, June 1942: Miss Cox has established Conjecture (N); Dr. Myers has
shown that the hypotheses of Conjecture (S) yield a semi-strong minimum, and for certain
integrands guarantee a strong minimum.


(2.2) φ^β(y, y') = 0 (β = 1, ..., m),
and the end conditions

(2.3) y^i(t_s) = T^{is}(a) (i = 0, 1, ..., n; s = 1, 2)


are satisfied.
The problem of Bolza is the problem of minimizing the functional


(2.4) J(C, a) = θ(a) + ∫_{t_1}^{t_2} f(y, y') dt


on the class of all admissible sets (C, a).

Following Carathéodory, we shall denote by y'(t) the vector (y^0'(t), ...,
y^n'(t)) if this vector is defined and finite, and the vector (0, ..., 0) otherwise.
We shall denote the length of a vector by enclosing the vector between
vertical bars; thus


|y_1 - y_2| = [Σ_i (y_1^i - y_2^i)^2]^{1/2},
|a| = [Σ_h (a^h)^2]^{1/2}.

The concept of weak relative minimum has been given several equivalent
formulations. For simplicity of notation, let us suppose that C_0 is a curve of
class C^1, represented by equations

(2.5) y^i = y_0^i(t) (t_1 ≤ t ≤ t_2)


in which the functions y_0^i(t) are of class C^1 and |y_0'| ≠ 0. A curve C is in the
first order ε-neighborhood of C_0 if it has a Lipschitzian representation (2.1)
such that


(2.6) |y(t) - y_0(t)| < ε (t_1 ≤ t ≤ t_2)
and 
(2.7) Lub. | — |/| @| 

hstst 
This neighborhood is easily seen to be independent of the particular repre- 
sentation of Co. 
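To make the first order neighborhood concrete, here is a small sampled check of conditions (2.6) and (2.7). Everything in it is an assumption made for illustration: the reference curve, the two comparison curves, the value of $\epsilon$, and the function name `in_first_order_neighborhood`. A grid test like this can only suggest membership; it is not a proof.

```python
import math

def in_first_order_neighborhood(y, dy, y0, dy0, t1, t2, eps, samples=2001):
    # Sampled test of (2.6) and (2.7) on an equally spaced grid of parameter values.
    for j in range(samples):
        t = t1 + (t2 - t1) * j / (samples - 1)
        dist = math.dist(y(t), y0(t))                      # |y(t) - y0(t)|
        ratio = math.dist(dy(t), dy0(t)) / math.hypot(*dy0(t))  # |y' - y0'| / |y0'|
        if dist >= eps or ratio >= eps:
            return False
    return True

# Hypothetical reference curve C0: the segment y0(t) = (t, 0).
def y0(t): return (t, 0.0)
def dy0(t): return (1.0, 0.0)

# Close in position and in direction.
def y_near(t): return (t, 0.01 * math.sin(t))
def dy_near(t): return (1.0, 0.01 * math.cos(t))

# Close in position only: the derivative oscillates with amplitude 1.
def y_wiggly(t): return (t, 0.01 * math.sin(100.0 * t))
def dy_wiggly(t): return (1.0, math.cos(100.0 * t))

near_ok = in_first_order_neighborhood(y_near, dy_near, y0, dy0, 0.0, 1.0, 0.05)
wiggly_ok = in_first_order_neighborhood(y_wiggly, dy_wiggly, y0, dy0, 0.0, 1.0, 0.05)
print(near_ok, wiggly_ok)  # True False
```

The second curve stays within 0.01 of $C_0$ but fails (2.7), which is the distinction between a weak and a strong neighborhood.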


We say that the functional $J(C, a)$ has a weak relative minimum at the admissible set $(C_0, a_0)$ if there is a positive number $\epsilon$ such that

$$J(C, a) \ge J(C_0, a_0)$$

for all admissible sets $(C, a)$ having $C$ in the first order $\epsilon$-neighborhood of $C_0$


348 : E. J. McSHANE [September . 


and $|a - a_0| < \epsilon$. The minimum is proper if equality is excluded from the inequality $J(C, a) \ge J(C_0, a_0)$ except when $(C, a) = (C_0, a_0)$. In this paper we shall set forth conditions which ensure that a set $(C_0, a_0)$ gives a proper weak relative minimum to $J(C, a)$.

3. Statement of the theorem. We use a slight modification of the summation convention. The repetition of an index in a term connotes the summation of the values of that term over all values of the repeated index, except that the indices $q$ and $s$ are exempted; we never sum over values of $q$ or of $s$. As usual, for each set of numbers $\lambda^0, \lambda^1, \dots, \lambda^m$ we define

(3.1) $F(y, r, \lambda) = \lambda^0 f(y, r) + \lambda^\beta \phi^\beta(y, r)$.


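Henceforth we suppose that $(C_0, a_0)$ is an admissible set, the curve $C_0$ being represented by (2.5) with functions $y_0^i(t)$ of class $C^1$ which have $|\dot y_0| > 0$. The curve $C_0$ satisfies the Euler equations with multipliers $\lambda^0, \lambda^1(t), \dots, \lambda^m(t)$ if

(3.2) $\dfrac{d}{dt} F_{r^i}(y_0(t), \dot y_0(t), \lambda(t)) = F_{y^i}(y_0(t), \dot y_0(t), \lambda(t))$ $(i = 0, \dots, n;\ t_1 \le t \le t_2)$.

The set $(C_0, a_0)$ satisfies the transversality condition with multipliers $\lambda^0, \lambda^1(t), \dots, \lambda^m(t)$ if

(3.3) $\lambda^0 \theta_h(a_0) + F_{r^i}(y_0(t_2), \dot y_0(t_2), \lambda(t_2))\, T^{i2}_h(a_0) - F_{r^i}(y_0(t_1), \dot y_0(t_1), \lambda(t_1))\, T^{i1}_h(a_0) = 0$ $(h = 1, \dots, r)$,

the subscript $h$ denoting partial differentiation with respect to $a^h$.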
For the curve $C_0$ it is well known that the quadratic form

(3.4) $F_{r^i r^j}(y_0(t), \dot y_0(t), \lambda(t))\, v^i v^j$ $(t_1 \le t \le t_2)$

vanishes whenever the vector $v$ is linearly dependent on $\dot y_0(t)$; that is, whenever there is a number $k$ such that

$$v^i = k\, \dot y_0^i(t) \qquad (i = 0, 1, \dots, n).$$

The curve $C_0$ is said to satisfy the strengthened Clebsch condition if for each $t$ in the interval (3.4) and every vector $v$ linearly independent of $\dot y_0(t)$ and satisfying the equations

(3.5) $\phi^\beta_{r^i}(y_0(t), \dot y_0(t))\, v^i = 0$ $(\beta = 1, \dots, m)$

the quadratic form (3.4) is positive.
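The degeneracy of the form (3.4) along the tangent direction can be seen on a simple case. The sketch below assumes the hypothetical integrand $f(y, r) = |r|$ with no side conditions, for which $F_{r^i r^j} v^i v^j = (|r|^2 |v|^2 - (r \cdot v)^2)/|r|^3$; the function name `clebsch_form` and the sample vectors are illustrative choices, not taken from the paper.

```python
def clebsch_form(r, v):
    # F_{r^i r^j} v^i v^j for the hypothetical integrand f(y, r) = |r|.
    rr = sum(a * a for a in r)
    vv = sum(b * b for b in v)
    rv = sum(a * b for a, b in zip(r, v))
    return (rr * vv - rv * rv) / rr ** 1.5

r = [3.0, 4.0, 0.0]                              # plays the role of the tangent vector
parallel = clebsch_form(r, [-1.5, -2.0, 0.0])    # v = -0.5 r, linearly dependent on r
transverse = clebsch_form(r, [0.0, 1.0, 2.0])    # v linearly independent of r
print(parallel, transverse)
```

The form vanishes on multiples of the tangent and is positive elsewhere, which is why the strengthened Clebsch condition only asks for positivity on vectors linearly independent of $\dot y_0(t)$.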

Our definition of admissible variations is somewhat more inclusive than the usual definition. We define an admissible variation set to be a set $[\eta(t), u] = [\eta^0(t), \dots, \eta^n(t), u^1, \dots, u^r]$ in which the functions $\eta^i(t)$ are absolutely continuous, have derivatives integrable together with their squares, and satisfy the equations of variation of (2.2), that is, the equations

(3.6) $\phi^\beta_{y^i}(y_0(t), \dot y_0(t))\,\eta^i(t) + \phi^\beta_{r^i}(y_0(t), \dot y_0(t))\,\dot\eta^i(t) = 0$ $(\beta = 1, \dots, m;\ t_1 \le t \le t_2)$,

for almost all values of $t$ in the interval $[t_1, t_2]$, and the numbers $u^h$ satisfy the equations of variation of (2.3), which are

(3.7) $\eta^i(t_s) - T^{is}_h(a_0)\,u^h = 0$ $(s = 1, 2;\ i = 0, 1, \dots, n)$.


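If $[\eta, u]$ is an admissible variation set, and $\lambda^0, \lambda^1(t), \dots, \lambda^m(t)$ are multipliers, we define the second variation due to $[\eta, u]$ by the equation

(3.8) $J_2(\eta, u, \lambda) = b_{hk} u^h u^k + \int_{t_1}^{t_2} 2\omega(t, \eta, \dot\eta)\,dt$,

where

(3.9) $b_{hk} = \lambda^0 \theta_{hk}(a_0) + F_{r^i}(y_0(t_2), \dot y_0(t_2), \lambda(t_2))\, T^{i2}_{hk}(a_0) - F_{r^i}(y_0(t_1), \dot y_0(t_1), \lambda(t_1))\, T^{i1}_{hk}(a_0)$

and

(3.10) $2\omega(t, \eta, p) = F_{y^i y^j} \eta^i \eta^j + 2F_{y^i r^j} \eta^i p^j + F_{r^i r^j} p^i p^j$,

the arguments of the functions in the right member being $(y_0(t), \dot y_0(t), \lambda(t))$.
An admissible variation set $[\eta, u]$ will be called essentially null if

(3.11) $u^h = 0$ $(h = 1, \dots, r)$

and there is a function $\rho(t)$ such that

(3.12) $\eta^i(t) = \rho(t)\, \dot y_0^i(t)$ $(i = 0, \dots, n;\ t_1 \le t \le t_2)$.

For such sets the second variation is known to have the value 0.
We can now state our principal theorem.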


THEOREM I. Let the following hypotheses be satisfied.

(1) The set $[C_0, a_0]$ is admissible and the curve $C_0$ is a simple arc of class(*) $C^2$, represented by equation (2.5) with functions $y_0^i(t)$ of class $C^2$.

(2) For all $t$ in the interval $[t_1, t_2]$ the matrix

$$\|\phi^\beta_{r^i}(y_0(t), \dot y_0(t))\|$$

has rank $m$.
(3) To each admissible variation set $[\eta, u]$ which is not essentially null there corresponds a set of continuous multipliers $\lambda^0 \ge 0, \lambda^1(t), \dots, \lambda^m(t)$ with which


(*) If we assume $C_0$ to be of class $C^1$, in the presence of hypothesis (3) we can show by the Hilbert differentiability theorem that it must be of class $C^2$.




the Euler equations, transversality condition and strengthened Clebsch condition 
are satisfied, and with which the inequality 


(3.13) $J_2(\eta, u, \lambda) > 0$


holds.
Then $(C_0, a_0)$ gives $J(C, a)$ a proper weak relative minimum on the class of admissible sets $(C, a)$.


4. A change of parameter. There is no loss of generality in assuming that (2.5) is the representation of $C_0$ in terms of arc length, so that $t_1 = 0$ and $t_2$ is the length of $C_0$ and

(4.1) $|\dot y_0(t)| = 1$ $(t_1 \le t \le t_2)$.

Our proof will be indirect; we assume the theorem false, and arrive (in §9) at a contradiction.

If the theorem is false, there exists a sequence of admissible sets $[C_q, a_q]$ $(q = 1, 2, 3, \dots)$ with the following properties. Each set $(C_q, a_q)$ is distinct from $(C_0, a_0)$. The curves $C_q$ have Lipschitzian representations

(4.2) $C_q$: $y^i = y_q^i(t)$ $(t_1 \le t \le t_2)$

such that

(4.3) $\lim_{q \to \infty} y_q(t) = y_0(t)$

uniformly in $[t_1, t_2]$ and

(4.4) $\lim_{q \to \infty} (\dot y_q(t) - \dot y_0(t)) = 0$

uniformly in $[t_1, t_2]$. The $a_q$ satisfy

(4.5) $\lim_{q \to \infty} a_q = a_0$.

And finally

(4.6) $J(C_q, a_q) \le J(C_0, a_0)$ $(q = 1, 2, \dots)$.


For convenience in our proofs it is highly desirable to choose a particular type of representation of the curves $C_q$. Specifically, we shall prove the following statement.

(4.7) There is no loss of generality in assuming that the representations (4.2) satisfy the equations

(4.8) $[y_q^i(t) - y_0^i(t)]\, \dot y_0^i(t) = A_q t + B_q$,

where $A_q$ and $B_q$ are constants.




The range of definition of the functions $y_0^i(t)$ may easily be extended to an interval $t_1 - \epsilon < t < t_2 + \epsilon$ in such a way that they remain of class $C^2$ and the curve $y^i = y_0^i(t)$ remains a simple arc. Consider the equations

(4.9) $[y^i - y_0^i(t)]\, \dot y_0^i(t) - At - B = 0$.

The equations have the initial solutions

(4.10) $y = y_0(t)$, $t_1 \le t \le t_2$; $A = B = 0$.

On a neighborhood of the set (4.10) the left member of (4.9) is of class $C^1$; on the set (4.10) the partial derivative of the left member with respect to $t$ has the value $-1$, by (4.1). Hence by a known theorem on implicit functions the equation (4.9) has a solution

(4.11) $t = t(y, A, B)$

defined and of class $C^1$ for all $(y, A, B)$ in a set

(4.12) $|A| < \delta$, $|B| < \delta$, $|y - y_0(t)| < \delta$ $(t_1 \le t \le t_2;\ \delta > 0)$

and assuming values in the interval $(t_1 - \epsilon, t_2 + \epsilon)$.

The constants $A_q$, $B_q$ of (4.8) are determined by giving $t$ the values $t_1$, $t_2$. By (4.1) and (4.3) the left member of (4.8) approaches zero uniformly, hence $A_q$ and $B_q$ both tend to zero as $q \to \infty$. We may therefore assume

(4.13) $|A_q| < \delta$, $|B_q| < \delta$

for all $q$. For all but a finite number of values of $q$, which we discard, each point $y_q(\tau)$ $(t_1 \le \tau \le t_2)$ lies within a distance $\delta$ of some point of $C_0$. Hence the functions

(4.14) $t_q(\tau) = t(y_q(\tau), A_q, B_q)$ $(t_1 \le \tau \le t_2)$

are defined and satisfy the equation

(4.15) $[y_q^i(\tau) - y_0^i(t_q(\tau))]\, \dot y_0^i(t_q(\tau)) = A_q t_q(\tau) + B_q$.

The functions $t_q(\tau)$ are clearly Lipschitzian. The constants $A_q$ and $B_q$ were so chosen that equation (4.8) is satisfied at $t_1$ and $t_2$, whence

(4.16) $t_q(t_1) = t_1$, $t_q(t_2) = t_2$.


By (4.3), $y_q(\tau)$ tends to $y_0(\tau)$ uniformly in the interval $[t_1, t_2]$, so by the definition (4.14) we have

(4.17) $\lim_{q \to \infty} t_q(\tau) = t(y_0(\tau), 0, 0) = \tau$

uniformly for $t_1 \le \tau \le t_2$. Since

$$|y_q(\tau) - y_0(t_q(\tau))| \le |y_q(\tau) - y_0(\tau)| + |y_0(\tau) - y_0(t_q(\tau))|,$$

relations (4.3) and (4.17) imply

(4.18) $\lim_{q \to \infty} |y_q(\tau) - y_0(t_q(\tau))| = 0$

uniformly on $t_1 \le \tau \le t_2$.

Let $M_q$ be the set of values of $\tau$ in $t_1 \le \tau \le t_2$ for which $y_q'(\tau)$ is defined; this set constitutes almost all of the interval $t_1 \le \tau \le t_2$. On this set the function $t_q(\tau)$ also has a derivative, as we see from (4.14). For all $\tau$ in $M_q$ the inequality

(4.19) $|\dot y_q(\tau) - \dot y_0(t_q(\tau))| \le |\dot y_q(\tau) - \dot y_0(\tau)| + |\dot y_0(\tau) - \dot y_0(t_q(\tau))|$

holds. So if $\gamma$ is an arbitrary positive number, for all large $q$ the inequality

(4.20) $|\dot y_q(\tau) - \dot y_0(t_q(\tau))| < \gamma$

holds on $M_q$, as follows from (4.19), (4.4), (4.17) and the continuity of $\dot y_0$.
By differentiating both members of (4.15) we find that on $M_q$ the equation

(4.21) $[A_q + 1 - \{y_q^i(\tau) - y_0^i(t_q(\tau))\}\, \ddot y_0^i(t_q(\tau))]\, t_q'(\tau) = \dot y_0^i(t_q(\tau))\, \dot y_q^i(\tau)$

holds. Let $\eta$ be an arbitrary positive number less than 1. Since $A_q$ tends to zero and (4.18) holds, the quantity in square brackets in (4.21) lies between $(1 + \eta/2n)^{-1/2}$ and $(1 + \eta/2n)^{1/2}$ for all sufficiently large $q$. Since $|\dot y_0| = 1$, by (4.20) with proper choice of (5) $\gamma$ the right member of (4.21) lies between the same bounds for all large $q$. Hence for all sufficiently large $q$ we have

(4.22) $\Big(1 + \dfrac{\eta}{2n}\Big)^{-1} < t_q'(\tau) < 1 + \dfrac{\eta}{2n}$.


In particular, if we choose $\eta = 1$ we find that for all but a finite number of values of $q$ (which we discard from further consideration) the value of $t_q'(\tau)$ exceeds 2/3 on $M_q$, which is almost all of $t_1 \le \tau \le t_2$. Hence $t_q(\tau)$ has a Lipschitzian inverse; we denote it by $\tau_q(t)$. By (4.16) we see that

$$\tau_q(t_1) = t_1, \qquad \tau_q(t_2) = t_2.$$

If $N_q$ is the image of $M_q$ under the mapping $t = t_q(\tau)$, then $N_q$ constitutes almost all of $t_1 \le t \le t_2$. On it the derivative of $\tau_q(t)$ exists and is the reciprocal of the derivative of $t_q(\tau)$, so by (4.22)

(4.23) $\Big(1 + \dfrac{\eta}{2n}\Big)^{-1} < \tau_q'(t) < 1 + \dfrac{\eta}{2n}$

for all $t$ in $N_q$, $q$ sufficiently large.
We now show that the equations

$$y^i = y_q^i(\tau_q(t)) \qquad (t_1 \le t \le t_2)$$


(®) The choice y = will serve. 




form the desired representation of $C_q$. By (4.15) the equation (4.8) is satisfied. By (4.18), the new functions satisfy the convergence condition (4.3). Let $\eta$ be an arbitrary number between 0 and 1. In (4.20) we choose $\gamma = \eta/4n$. For all large $q$ inequalities (4.20) and (4.23) hold on $N_q$; and $t$ being arc length on $y_0$, we have

$$\Big|\frac{d}{dt}\big[y_q^i(\tau_q(t)) - y_0^i(t)\big]\Big| \le |\dot y_q^i(\tau_q(t)) - \dot y_0^i(t)|\, \tau_q'(t) + |\dot y_0^i(t)|\, |\tau_q'(t) - 1| < \frac{\eta}{4n}\Big(1 + \frac{\eta}{2n}\Big) + \frac{\eta}{2n} < \frac{\eta}{n}.$$

By integrating we see that each component

$$y_q^i(\tau_q(t)) - y_0^i(t)$$

satisfies a Lipschitz condition of constant $\eta/n$, so the derivative has absolute value at most $\eta/n$ where it is defined. Thus

(4.24) $|\{y_q(\tau_q(t)) - y_0(t)\}^{\cdot}| \le \eta$

whenever all the derivatives are defined. Elsewhere inequality (4.24) is trivial, the left member being zero. That is, the left member of (4.24) tends uniformly to zero as $q \to \infty$, completing the proof.

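5. A convergence lemma. For each $q$ we define a non-negative number $k_q$ by the equation

(5.1) $k_q^2 = |a_q - a_0|^2 + \big(\max_{t_1 \le t \le t_2} |y_q(t) - y_0(t)|\big)^2 + \int_{t_1}^{t_2} |\dot y_q(t) - \dot y_0(t)|^2\,dt$.

These numbers are actually positive; otherwise $(C_q, a_q)$ would be identical with $(C_0, a_0)$, contrary to hypothesis. Next we define

(5.2) $u_q^h = (a_q^h - a_0^h)/k_q$ $(h = 1, \dots, r)$,

(5.3) $\eta_q^i(t) = (y_q^i(t) - y_0^i(t))/k_q$ $(q = 1, 2, \dots;\ i = 0, 1, \dots, n;\ t_1 \le t \le t_2)$.

From the preceding equations we have at once for each $q$ the equation

(5.4) $|u_q|^2 + \big(\max |\eta_q(t)|\big)^2 + \int_{t_1}^{t_2} |\dot\eta_q(t)|^2\,dt = 1$,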


so that each summand on the left is at most 1. By the Bolzano-Weierstrass theorem we can select a subsequence of the $u_q$ which converges to a limit $u_0$. There is no loss of generality in supposing that $\{u_q\}$ is already such a sequence, so that

(5.5) $\lim_{q \to \infty} u_q = u_0$.


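Let $(\alpha_1, \beta_1), \dots, (\alpha_l, \beta_l)$ be a set of nonoverlapping subintervals of $[t_1, t_2]$, and let $E$ be the set consisting of the sum of these intervals. Then by the inequality of Schwarz

(5.6) $\sum_{j=1}^{l} |\eta_q^i(\beta_j) - \eta_q^i(\alpha_j)| \le \int_E |\dot\eta_q^i|\,dt \le \Big[\int_E |\dot\eta_q^i|^2\,dt\Big]^{1/2} [mE]^{1/2} \le 1 \cdot [mE]^{1/2} = \Big[\sum_{j=1}^{l} (\beta_j - \alpha_j)\Big]^{1/2}$.

In particular, letting $l = 1$ we see that the $\eta_q^i$ are equi-continuous. Since they have the uniform bound 1 by (5.4), we know by Ascoli's theorem that the sequence $\{\eta_q\}$ contains a subsequence converging uniformly to a limit function $\eta_0(t)$. We may suppose that $\{\eta_q\}$ is already such a subsequence, so that

(5.7) $\lim_{q \to \infty} \eta_q(t) = \eta_0(t)$

uniformly on the interval $[t_1, t_2]$.
If we let $q$ tend to $\infty$ in (5.6) we obtain

$$\sum_{j=1}^{l} |\eta_0^i(\beta_j) - \eta_0^i(\alpha_j)| \le \Big[\sum_{j=1}^{l} (\beta_j - \alpha_j)\Big]^{1/2},$$

so that the functions $\eta_0^i(t)$ are absolutely continuous. We wish now to show that their derivatives have integrable squares.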


Lemma 1(*). Under the hypotheses on the $\eta_q$, the squares of the derivatives of the $\eta_0^i(t)$ are summable, and

(5.8) $\int_{t_1}^{t_2} |\dot\eta_0|^2\,dt \le \liminf_{q \to \infty} \int_{t_1}^{t_2} |\dot\eta_q|^2\,dt \le 1$.


(*) This is in fact a corollary of almost any theorem on semi-continuity of integrals in non- 
parametric form. Lemma 3 is also a consequence of known theorems. But it seems preferable to 
give the fairly simple proofs of these lemmas rather than refer the reader to some exposition 
containing complications not essential for our present needs. 



For each positive integer k we subdivide the interval [#, %] into 2* equal 
subintervals by points 
(5.9) < = by, 


and define piecewise linear functions p}(t) which coincide with 7‘(¢) at each 
7, (J=1,---, 2*+1) and are linear between. Since the derivatives of these 
functions are constant on each subinterval we find 


2k 


(5.10) f = ated — 90. 


1 
By Schwarz’ inequality, 
| — |* [ as| (rus — 7) | 


Hence 
This, with (5.10), yields 
1 1 


ec 


The integrand on the right is non-negative, and except on the set of measure 
zero on which one or more of the functions 7, fi, -- - are non-differentiable 
its limit as kK © is | jo(t)|?. By Fatou’s lemma, 


te ts tz 
(5.13) tim ing f tim int f f | 
ty ty ty 


This establishes the lemma. 
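The mechanism behind the proof (piecewise linear interpolation on $2^k$ equal subintervals, with Schwarz' inequality bounding the interpolant's energy by that of the original function) can be checked numerically. The sketch below is an illustration under stated assumptions: the test function `eta`, the interval, and the helper names are hypothetical, and only a single scalar component is treated.

```python
import math

T1, T2 = 0.0, 2.0  # hypothetical interval [t1, t2]

def eta(t):
    # Hypothetical smooth test function standing in for one component of eta.
    return math.sin(3.0 * t)

def exact_energy():
    # Closed form of the integral of (3 cos 3t)^2 over [T1, T2].
    return 4.5 * (T2 - T1) + 0.75 * (math.sin(6.0 * T2) - math.sin(6.0 * T1))

def interpolant_energy(k):
    # Integral of the squared derivative of the piecewise linear interpolant
    # of eta on 2**k equal subintervals: sum of (jump)^2 / (subinterval length).
    n = 2 ** k
    h = (T2 - T1) / n
    total = 0.0
    for j in range(n):
        jump = eta(T1 + (j + 1) * h) - eta(T1 + j * h)
        total += jump * jump / h
    return total

energies = [interpolant_energy(k) for k in range(1, 11)]
print(energies[0], energies[-1], exact_energy())
```

The interpolant energies increase with $k$ and stay below the energy of `eta`, approaching it as the subdivision is refined.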
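6. The equations of variation. In order to show that the $\eta_0^i(t)$ satisfy the equations of variation (3.6) it is convenient to prove a lemma.


Lemma 2. If $g(t)$ is summable together with its square on the interval $[t_1, t_2]$, then

(6.1) $\lim_{q \to \infty} \int_{t_1}^{t_2} g(t)\,[\eta_q^i(t) - \eta_0^i(t)]\,dt = \lim_{q \to \infty} \int_{t_1}^{t_2} g(t)\,[\dot\eta_q^i(t) - \dot\eta_0^i(t)]\,dt = 0$ $(i = 0, 1, \dots, n)$.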
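The vanishing of the first limit in (6.1) is easily established, for

(6.2) $\Big|\int_{t_1}^{t_2} g\,(\eta_q^i - \eta_0^i)\,dt\Big| \le \max_{t_1 \le t \le t_2} |\eta_q(t) - \eta_0(t)| \cdot \int_{t_1}^{t_2} |g|\,dt$,

and the right member tends to zero by (5.7). Consider then the other limit. Let $\epsilon$ be an arbitrary positive number. As is well known, it is possible to find a polynomial $p(t)$ such that

(6.3) $\int_{t_1}^{t_2} [g(t) - p(t)]^2\,dt < \epsilon^2/16$.

By (5.4) and (5.8), we have

$$\int_{t_1}^{t_2} |\dot\eta_q|^2\,dt \le 1, \qquad \int_{t_1}^{t_2} |\dot\eta_0|^2\,dt \le 1.$$

So by Minkowski's inequality

$$\Big[\int_{t_1}^{t_2} |\dot\eta_q - \dot\eta_0|^2\,dt\Big]^{1/2} \le 2.$$

From this and Schwarz' inequality we obtain

(6.4) $\Big|\int_{t_1}^{t_2} (g - p)(\dot\eta_q^i - \dot\eta_0^i)\,dt\Big| \le \Big[\int_{t_1}^{t_2} (g - p)^2\,dt\Big]^{1/2} \Big[\int_{t_1}^{t_2} (\dot\eta_q^i - \dot\eta_0^i)^2\,dt\Big]^{1/2} < \epsilon/2$.

By integration by parts,

(6.5) $\int_{t_1}^{t_2} p\,(\dot\eta_q^i - \dot\eta_0^i)\,dt = \big[p\,(\eta_q^i - \eta_0^i)\big]_{t_1}^{t_2} - \int_{t_1}^{t_2} \dot p\,(\eta_q^i - \eta_0^i)\,dt$.

The first term on the right tends to zero by (5.7), and the second tends to zero by the part of the lemma already proved. Therefore for all $q$ greater than a certain $q_\epsilon$ we have

(6.6) $\Big|\int_{t_1}^{t_2} p\,(\dot\eta_q^i - \dot\eta_0^i)\,dt\Big| < \epsilon/2$.

Now (6.4) and (6.6) imply

$$\Big|\int_{t_1}^{t_2} g\,(\dot\eta_q^i - \dot\eta_0^i)\,dt\Big| < \epsilon$$

if $q > q_\epsilon$, and our lemma is established.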
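Lemma 2 says that the derivatives converge in a weak (integrated) sense even when they do not converge pointwise or in mean square. A minimal numerical sketch, with every concrete choice an assumption made for illustration: take the hypothetical sequence $\eta_q(t) = \sin(qt)/q$ on $[0, \pi]$, whose uniform limit is $\eta_0 = 0$, and the arbitrary weight $g(t) = t$. This sequence is not normalized as in (5.4); it only exhibits the phenomenon.

```python
import math

def midpoint(func, a, b, n=20000):
    # Composite midpoint rule.
    h = (b - a) / n
    return h * sum(func(a + (j + 0.5) * h) for j in range(n))

def g(t):
    # Arbitrary square-summable weight chosen for the example.
    return t

def weak_pairing(q):
    # Integral over [0, pi] of g(t) times the derivative cos(q t) of sin(q t)/q.
    return midpoint(lambda t: g(t) * math.cos(q * t), 0.0, math.pi)

def energy(q):
    # Integral over [0, pi] of the squared derivative.
    return midpoint(lambda t: math.cos(q * t) ** 2, 0.0, math.pi)

pairings = [weak_pairing(q) for q in (5, 21, 41)]
energies = [energy(q) for q in (5, 21, 41)]
print(pairings, energies)
```

The pairings shrink toward zero while each energy stays near $\pi/2$, so the limit function has strictly smaller energy than the lower limit of the energies: the inequality (5.8) of Lemma 1 can be strict.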
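The curves $C_0$ and $C_q$ are admissible; so the equations

(6.7) $\phi^\beta(y_0(t), \dot y_0(t)) = \phi^\beta(y_q(t), \dot y_q(t)) = 0$

hold for almost all $t$ in the interval $[t_1, t_2]$. Hence, recalling (5.3), the equation

(6.8) $A^\beta_{q,i}(t)\, \eta_q^i(t) + B^\beta_{q,i}(t)\, \dot\eta_q^i(t) = 0$

holds for almost all $t$, the coefficients being defined by equations

(6.9) $A^\beta_{q,i}(t) = \int_0^1 \phi^\beta_{y^i}(y_0 + \tau[y_q - y_0],\ \dot y_0 + \tau[\dot y_q - \dot y_0])\,d\tau$, $\quad B^\beta_{q,i}(t) = \int_0^1 \phi^\beta_{r^i}(y_0 + \tau[y_q - y_0],\ \dot y_0 + \tau[\dot y_q - \dot y_0])\,d\tau$.

From (4.3) and (4.4) we deduce

(6.10) $\lim_{q \to \infty} A^\beta_{q,i}(t) = \phi^\beta_{y^i}(y_0(t), \dot y_0(t))$,

(6.11) $\lim_{q \to \infty} B^\beta_{q,i}(t) = \phi^\beta_{r^i}(y_0(t), \dot y_0(t))$

uniformly on $t_1 \le t \le t_2$. For each such $t$ we have, by (6.8),

(6.12) $\int_{t_1}^{t} [\phi^\beta_{y^i}(y_0, \dot y_0)\,\eta_q^i + \phi^\beta_{r^i}\,\dot\eta_q^i]\,dt = \int_{t_1}^{t} [(\phi^\beta_{y^i} - A^\beta_{q,i})\,\eta_q^i + (\phi^\beta_{r^i} - B^\beta_{q,i})\,\dot\eta_q^i]\,dt$.

By the Schwarz inequality, with (5.4) and (6.11), the right member of (6.12) approaches 0 as $q \to \infty$. The limit of the left member is readily found with the help of (6.10) and Lemma 2; we obtain

(6.13) $\int_{t_1}^{t} [\phi^\beta_{y^i}(y_0, \dot y_0)\,\eta_0^i + \phi^\beta_{r^i}\,\dot\eta_0^i]\,dt = 0$.

(Here and henceforth we indicate the arguments only in the first function in a bracketed expression whenever the remaining terms have the same arguments.) By differentiating both members of (6.13) we find that $\eta_0(t)$ satisfies equations (3.6) for almost all $t$ in the interval $t_1 \le t \le t_2$.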

Since the sets $(C_0, a_0)$ and $(C_q, a_q)$ are admissible, the equations

(6.14) $y_0^i(t_s) = T^{is}(a_0)$, $\quad y_q^i(t_s) = T^{is}(a_q)$ $(i = 0, 1, \dots, n;\ s = 1, 2)$

are satisfied. With (5.2) and (5.3), this implies

(6.15) $k_q\, \eta_q^i(t_s) = T^{is}(a_0 + k_q u_q) - T^{is}(a_0)$.

So by the theorem of mean value there is an $\bar a_q$ on the line segment joining $a_0$ and $a_q$ such that

(6.16) $\eta_q^i(t_s) = T^{is}_h(\bar a_q)\, u_q^h$.

By passage to the limit we find

(6.17) $\eta_0^i(t_s) = T^{is}_h(a_0)\, u_0^h$ $(i = 0, 1, \dots, n;\ s = 1, 2)$.

Thus it has been shown that $[\eta_0, u_0]$ is an admissible set, as defined in §3.

7. A semi-continuity proof. In the course of our proof we shall have need of a generalization of Lemma 1. Let us suppose that $a_{ij}(t)$, $b_{ij}(t)$, $c_{ij}(t)$ $(i, j = 0, 1, \dots, n)$ are functions defined and continuous on the interval $t_1 \le t \le t_2$. We define

(7.1) $I(\eta) = \int_{t_1}^{t_2} [a_{ij}\, \eta^i \eta^j + 2 b_{ij}\, \eta^i \dot\eta^j + c_{ij}\, \dot\eta^i \dot\eta^j]\,dt$.

Concerning such integrals we prove a sequence of three lemmas, the last of which is the one needed later.


Lemma 3. If for each $t$ in the interval $[t_1, t_2]$ the quadratic form $c_{ij} v^i v^j$ is positive definite, then

(7.2) $\liminf_{q \to \infty} I(\eta_q) \ge I(\eta_0)$.


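It is evident that there is no loss of generality in assuming

(7.3) $a_{ij}(t) = a_{ji}(t)$, $c_{ij}(t) = c_{ji}(t)$ $(t_1 \le t \le t_2;\ i, j = 0, 1, \dots, n)$.

We readily compute

(7.4) $a_{ij}\eta_q^i\eta_q^j + 2b_{ij}\eta_q^i\dot\eta_q^j + c_{ij}\dot\eta_q^i\dot\eta_q^j = a_{ij}\eta_0^i\eta_0^j + 2b_{ij}\eta_0^i\dot\eta_0^j + c_{ij}\dot\eta_0^i\dot\eta_0^j$
$\quad + 2(a_{ij}\eta_0^i + b_{ji}\dot\eta_0^i)(\eta_q^j - \eta_0^j)$
$\quad + 2(b_{ij}\eta_0^i + c_{ij}\dot\eta_0^i)(\dot\eta_q^j - \dot\eta_0^j)$
$\quad + a_{ij}(\eta_q^i - \eta_0^i)(\eta_q^j - \eta_0^j)$
$\quad + 2b_{ij}(\eta_q^i - \eta_0^i)(\dot\eta_q^j - \dot\eta_0^j)$
$\quad + c_{ij}(\dot\eta_q^i - \dot\eta_0^i)(\dot\eta_q^j - \dot\eta_0^j)$.

The last term is non-negative, hence

(7.5) $I(\eta_q) \ge I(\eta_0) + 2\int_{t_1}^{t_2} (a_{ij}\eta_0^i + b_{ji}\dot\eta_0^i)(\eta_q^j - \eta_0^j)\,dt + 2\int_{t_1}^{t_2} (b_{ij}\eta_0^i + c_{ij}\dot\eta_0^i)(\dot\eta_q^j - \dot\eta_0^j)\,dt$
$\quad + \int_{t_1}^{t_2} a_{ij}(\eta_q^i - \eta_0^i)(\eta_q^j - \eta_0^j)\,dt + 2\int_{t_1}^{t_2} b_{ij}(\eta_q^i - \eta_0^i)(\dot\eta_q^j - \dot\eta_0^j)\,dt$.

The second and third terms in the right member approach zero by Lemma 2. The fourth tends to zero by (5.7). The last term can be shown to approach zero by using Schwarz' inequality and recalling (5.7) and (5.4). This establishes the lemma.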


Lemma 4. If for each $t$ in the interval $[t_1, t_2]$ the quadratic form $c_{ij} v^i v^j$ is positive for all non-null vectors $v$ which satisfy the equations

(7.6) $\phi^\beta_{r^i}(y_0(t), \dot y_0(t))\, v^i = 0$ $(\beta = 1, \dots, m)$,

then inequality (7.2) is satisfied.


Let $E$ be the set of points $(t, v)$ in $(n+2)$-dimensional space satisfying the conditions $t_1 \le t \le t_2$, $|v| = 1$. For each positive integer $p$ we define $U_p$ to be the set of all points $(t, v)$ at which

(7.7) $c_{ij} v^i v^j + p\, \phi^\beta_{r^i} v^i\, \phi^\beta_{r^j} v^j > 0$.

Every point of $E$ which satisfies (7.6) is in $U_p$ for all $p$; every point of $E$ at which (7.6) is false is in $U_p$ if $p$ is large enough. By the Borel theorem a finite number of the $U_p$, say those with subscripts $p_1, \dots, p_k$, cover $E$. Let $N$ be the greatest of these subscripts; then

(7.8) $c_{ij} v^i v^j + N\, \phi^\beta_{r^i} v^i\, \phi^\beta_{r^j} v^j > 0$

for $t_1 \le t \le t_2$ and $|v| = 1$. By homogeneity, (7.8) continues to hold if $t_1 \le t \le t_2$ and $|v| \ne 0$.
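The device used here, adding a large multiple of the squared constraint to a form that is positive only on the constraint set, can be seen on a two-dimensional toy case. All concrete choices below are assumptions for illustration: the indefinite form `c_form`, the single linear constraint `constraint`, and the value $N = 3$.

```python
import math

def c_form(v):
    # Hypothetical form c_ij v^i v^j = (v0)^2 - (v1)^2: indefinite, but positive where v1 = 0.
    return v[0] ** 2 - v[1] ** 2

def constraint(v):
    # Hypothetical single linear constraint, playing the role of phi_{r^i} v^i.
    return v[1]

def augmented_min(N, samples=3600):
    # Minimum over sampled unit vectors of c(v) + N * (constraint(v))^2.
    best = math.inf
    for j in range(samples):
        angle = 2.0 * math.pi * j / samples
        v = (math.cos(angle), math.sin(angle))
        best = min(best, c_form(v) + N * constraint(v) ** 2)
    return best

min_without = augmented_min(0)  # the original form on the unit circle
min_with = augmented_min(3)     # the augmented form with N = 3
print(min_without, min_with)
```

Without the added term the form takes negative values on the unit circle; with $N = 3$ it is bounded below by a positive constant, which is the situation described by (7.8).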
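Now define

(7.9) $I_N(\eta) = \int_{t_1}^{t_2} [a_{ij}\eta^i\eta^j + 2b_{ij}\eta^i\dot\eta^j + c_{ij}\dot\eta^i\dot\eta^j + N(\phi^\beta_{y^i}\eta^i + \phi^\beta_{r^i}\dot\eta^i)(\phi^\beta_{y^j}\eta^j + \phi^\beta_{r^j}\dot\eta^j)]\,dt$.

The terms quadratic in $\dot\eta$ constitute the left member of (7.8), so $I_N$ satisfies the hypotheses of Lemma 3, and

(7.10) $\liminf_{q \to \infty} I_N(\eta_q) \ge I_N(\eta_0)$.

Since $\eta_0$ satisfies the equations (3.6), the terms added to $I(\eta_0)$ in (7.9) vanish, so that

(7.11) $I(\eta_0) = I_N(\eta_0)$.

This is not true of $\eta_q$. But from (6.8), with (7.9) and (7.1), we find

(7.12) $I_N(\eta_q) - I(\eta_q) = N\int_{t_1}^{t_2} (\phi^\beta_{y^i}\eta_q^i + \phi^\beta_{r^i}\dot\eta_q^i)(\phi^\beta_{y^j}\eta_q^j + \phi^\beta_{r^j}\dot\eta_q^j)\,dt$
$\quad = N\int_{t_1}^{t_2} \{[\phi^\beta_{y^i}\phi^\beta_{y^j} - A^\beta_{q,i}A^\beta_{q,j}]\,\eta_q^i\eta_q^j + 2[\phi^\beta_{y^i}\phi^\beta_{r^j} - A^\beta_{q,i}B^\beta_{q,j}]\,\eta_q^i\dot\eta_q^j + [\phi^\beta_{r^i}\phi^\beta_{r^j} - B^\beta_{q,i}B^\beta_{q,j}]\,\dot\eta_q^i\dot\eta_q^j\}\,dt$.

The quantities in square brackets tend uniformly to zero, by (6.10) and (6.11), and the integrals of the absolute values of their coefficients are bounded; so

(7.13) $\lim_{q \to \infty} [I_N(\eta_q) - I(\eta_q)] = 0$.

From (7.10), (7.11) and (7.13) we obtain the conclusion (7.2) of our lemma.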


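Lemma 5. If for each $t$ in the interval $[t_1, t_2]$ the quadratic form $c_{ij} v^i v^j$ is non-negative whenever the vector $v$ is linearly independent of $\dot y_0(t)$ and satisfies equations (7.6), then inequality (7.2) is satisfied.


Let $\epsilon$ be an arbitrary positive number, and let

(7.14) $I_\epsilon(\eta) = \int_{t_1}^{t_2} [a_{ij}\eta^i\eta^j + 2b_{ij}\eta^i\dot\eta^j + c_{ij}\dot\eta^i\dot\eta^j + \epsilon\,\dot\eta^i\dot\eta^i]\,dt$.

The quadratic form $c_{ij} v^i v^j$ is here replaced by

$$c_{ij} v^i v^j + \epsilon\, v^i v^i.$$

This is positive for all non-null vectors $v$ which satisfy equations (7.6). For the second term is positive, while the first is non-negative, by hypothesis if $v$ is linearly independent of $\dot y_0$ and by continuity if $v$ is a multiple of $\dot y_0$. So $I_\epsilon$ satisfies the hypotheses of Lemma 4, and

(7.15) $\liminf_{q \to \infty} I_\epsilon(\eta_q) \ge I_\epsilon(\eta_0)$.

From (5.4) we deduce

(7.16) $0 \le \int_{t_1}^{t_2} \dot\eta_q^i \dot\eta_q^i\,dt \le 1$.

Hence

(7.17) $I(\eta_0) \le I_\epsilon(\eta_0) \le \liminf_{q \to \infty} I_\epsilon(\eta_q) \le \liminf_{q \to \infty} I(\eta_q) + \epsilon$.

Since $\epsilon$ is an arbitrary positive number, this implies inequality (7.2), and the proof of the lemma is complete.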

8. First case. We now distinguish two possible cases.

Case I. Either $u_0 \ne (0, \dots, 0)$ or else the functions $\eta_0^i(t)$ do not all vanish identically.

Case II. The numbers $u_0^h$ are all 0 and the functions $\eta_0^i(t)$ all vanish identically.

In this section we shall discuss the first case.

Under the hypotheses of Case I, the set $[\eta_0, u_0]$ is not essentially null. This is obvious from the definition if some $u_0^h$ is not zero. If all the $u_0^h$ have the value 0, we have by (4.8) and (5.3)

(8.1) $\eta_q^i(t)\, \dot y_0^i(t) = (A_q/k_q)\,t + B_q/k_q$.

The left member converges uniformly as $q \to \infty$, so its limit $\eta_0^i \dot y_0^i$ must be linear in $t$. But since $u_0 = 0$ equations (6.17) show that this linear function vanishes at $t_1$ and at $t_2$, so it is identically zero:

(8.2) $\eta_0^i(t)\, \dot y_0^i(t) = 0$ $(t_1 \le t \le t_2)$.

Now $\eta_0$ cannot have the form $\eta_0^i = \rho(t)\, \dot y_0^i(t)$ $(i = 0, \dots, n)$; for then (8.2) would give $\rho = 0$, contrary to the hypothesis that the $\eta_0^i$ do not all vanish identically.

By the hypotheses of Theorem I, there are multipliers $\lambda^0 \ge 0, \lambda^1(t), \dots, \lambda^m(t)$ with which the set $(C_0, a_0)$ satisfies the Euler equations, transversality condition and strengthened Clebsch condition and with which $J_2(\eta_0, u_0, \lambda)$ is positive. This last condition will not be used until the last paragraph of this section.

Since $\lambda^0$ is non-negative, inequality (4.6) implies

(8.3) $\lambda^0 J(C_q, a_q) - \lambda^0 J(C_0, a_0) \le 0$.

Recalling that along an admissible curve we have $F = \lambda^0 f$, this can be written in the form

(8.4) $\lambda^0[\theta(a_q) - \theta(a_0)] + \int_{t_1}^{t_2} [F(y_q, \dot y_q, \lambda) - F(y_0, \dot y_0, \lambda)]\,dt \le 0$.


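By Taylor's theorem with integral form of remainder, this implies (with (5.2) and (5.3))

(8.5) $k_q \lambda^0 \theta_h(a_0)\, u_q^h + k_q^2 \int_0^1 (1 - \tau)\, \lambda^0 \theta_{hk}(a_0 + \tau k_q u_q)\, u_q^h u_q^k\,d\tau + k_q \int_{t_1}^{t_2} [F_{y^i}(y_0, \dot y_0, \lambda)\,\eta_q^i + F_{r^i}\,\dot\eta_q^i]\,dt$
$\quad + k_q^2 \int_{t_1}^{t_2}\!\int_0^1 (1 - \tau)\,[F_{y^i y^j}(y_0 + \tau k_q \eta_q,\ \dot y_0 + \tau k_q \dot\eta_q,\ \lambda)\,\eta_q^i\eta_q^j + 2F_{y^i r^j}\,\eta_q^i\dot\eta_q^j + F_{r^i r^j}\,\dot\eta_q^i\dot\eta_q^j]\,d\tau\,dt \le 0$.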
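The expansion used here is Taylor's formula with integral remainder, $\varphi(1) = \varphi(0) + \varphi'(0) + \int_0^1 (1 - \tau)\,\varphi''(\tau)\,d\tau$, applied along the segment joining the comparison data to the data of $C_0$. A quick numerical check on a hypothetical scalar function (the choice $\varphi(s) = e^{2s}$ and the helper `midpoint` are assumptions for illustration) also confirms that the weight $(1 - \tau)$ integrates to 1/2, the fact used later when $k_q$ is replaced by 0.

```python
import math

def midpoint(func, a, b, n=20000):
    # Composite midpoint rule.
    h = (b - a) / n
    return h * sum(func(a + (j + 0.5) * h) for j in range(n))

# Hypothetical smooth function standing in for s -> F evaluated along the segment.
def phi(s):
    return math.exp(2.0 * s)

def dphi(s):
    return 2.0 * math.exp(2.0 * s)

def d2phi(s):
    return 4.0 * math.exp(2.0 * s)

# Integral remainder of the first-order Taylor expansion about s = 0, evaluated at s = 1.
remainder = midpoint(lambda s: (1.0 - s) * d2phi(s), 0.0, 1.0)
taylor_gap = abs(phi(1.0) - (phi(0.0) + dphi(0.0) + remainder))
# Integral of the weight (1 - tau) over [0, 1].
weight = midpoint(lambda s: 1.0 - s, 0.0, 1.0)
print(taylor_gap, weight)
```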
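To the third term in (8.5) we apply the usual integration by parts. Since by hypothesis the Euler equations are satisfied, this term has the value

(8.6) $k_q \big[F_{r^i}(y_0(t), \dot y_0(t), \lambda(t))\, \eta_q^i(t)\big]_{t_1}^{t_2}$.

But

(8.7) $k_q \eta_q^i(t_s) = y_q^i(t_s) - y_0^i(t_s) = T^{is}(a_q) - T^{is}(a_0) = k_q T^{is}_h(a_0)\, u_q^h + k_q^2 \int_0^1 (1 - \tau)\, T^{is}_{hk}(a_0 + \tau k_q u_q)\, u_q^h u_q^k\,d\tau$.

We substitute this in (8.6), and substitute expression (8.6) for the third term in (8.5). Since by hypothesis the transversality condition is satisfied, inequality (8.5) yields, after division by $k_q^2/2$,

(8.8) $D_q \equiv 2\int_0^1 (1 - \tau)\{\lambda^0 \theta_{hk}(a_0 + \tau k_q u_q) + F_{r^i}(y_0(t_2), \dot y_0(t_2), \lambda(t_2))\, T^{i2}_{hk}(a_0 + \tau k_q u_q) - F_{r^i}(y_0(t_1), \dot y_0(t_1), \lambda(t_1))\, T^{i1}_{hk}(a_0 + \tau k_q u_q)\}\,d\tau\; u_q^h u_q^k$
$\quad + 2\int_{t_1}^{t_2}\!\int_0^1 (1 - \tau)\,[F_{y^i y^j}(y_0 + \tau k_q \eta_q,\ \dot y_0 + \tau k_q \dot\eta_q,\ \lambda)\,\eta_q^i\eta_q^j + 2F_{y^i r^j}\,\eta_q^i\dot\eta_q^j + F_{r^i r^j}\,\dot\eta_q^i\dot\eta_q^j]\,d\tau\,dt \le 0$.

Therefore it is clear that

(8.9) $\limsup_{q \to \infty} D_q \le 0$.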


Since by (5.3)

$$y_0(t) + \tau k_q \eta_q(t) = y_0(t) + \tau[y_q(t) - y_0(t)],$$

with a like equation for the derivatives, we see by (4.3) and (4.4) that

(8.10) $\lim_{q \to \infty} F_{y^i y^j}(y_0 + \tau k_q \eta_q,\ \dot y_0 + \tau k_q \dot\eta_q,\ \lambda) = F_{y^i y^j}(y_0, \dot y_0, \lambda)$

uniformly for $0 \le \tau \le 1$ and $t_1 \le t \le t_2$. Relations similar to (8.10) hold for the other coefficients in (8.8). Hence if $\epsilon$ is an arbitrary positive number, for all sufficiently large values of $q$ the replacement of $k_q$ by 0 in (8.8) alters each coefficient by less than $\epsilon$. But after this replacement the variable $\tau$ has disappeared from (8.8) except in the factors $(1 - \tau)$, whose integral from 0 to 1 has the value 1/2. The result is that the left member of (8.8) takes the form $J_2(\eta_q, u_q, \lambda)$ (cf. (3.8)). Since each coefficient was changed by less than $\epsilon$, we thus find with the help of Schwarz' inequality


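(8.11) $|D_q - J_2(\eta_q, u_q, \lambda)| \le \epsilon \Big(\sum_h |u_q^h|\Big)^2 + \epsilon \int_{t_1}^{t_2} \Big[\Big(\sum_i |\eta_q^i|\Big)^2 + 2\Big(\sum_i |\eta_q^i|\Big)\Big(\sum_i |\dot\eta_q^i|\Big) + \Big(\sum_i |\dot\eta_q^i|\Big)^2\Big]\,dt$.

The coefficient of $\epsilon$ is bounded, since (5.4) is satisfied, and $\epsilon$ is arbitrary. Hence

(8.12) $\lim_{q \to \infty} [D_q - J_2(\eta_q, u_q, \lambda)] = 0$.

Relations (8.9) and (8.12) imply

(8.13) $\limsup_{q \to \infty} J_2(\eta_q, u_q, \lambda) \le 0$.

By (5.5) we have

(8.14) $\lim_{q \to \infty} b_{hk}\, u_q^h u_q^k = b_{hk}\, u_0^h u_0^k$.

The integral in (3.8) satisfies the hypotheses of Lemma 5, since the strengthened Clebsch condition holds. Hence from Lemma 5 and equation (8.14) we obtain

(8.15) $\liminf_{q \to \infty} J_2(\eta_q, u_q, \lambda) \ge J_2(\eta_0, u_0, \lambda)$.

But now we have reached our desired contradiction. For by the choice of the $\lambda^0$ and $\lambda^\beta(t)$ the right member of (8.15) is positive, so inequalities (8.13) and (8.15) are incompatible.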

9. Second case. We still have to dispose of Case II, in which the $u_0^h$ and $\eta_0^i(t)$ are all zero. The hypotheses of Theorem I do not mention such variation sets, so we must prove a lemma.


Lemma 6. Under the hypotheses of Theorem I, there exist admissible variation sets which are not essentially null.


Bliss(7) has shown that there exist functions $7(¢, 7) of class C? (y=m 
+1,-+-+,m-+1) such that the determinant 
yo(t), 
ora(t, 


does not vanish on the interval 4; St St. In the interval we choose »+3 points 
7, such that 


(9.1) 


hSm<t2< Sk. 


(7) G. A. Bliss, The problem of Mayer with variable end points, these Transactions, vol. 19 
(1918), pp. 305-314. 




For /=1, - - - , m+2 we choose functions (k =1, - - - , m+1) with the fol- 
lowing properties. If k=1,---, m, then {j(t) is identically zero. Except on 
[r1, Ti41] the other (F(t) also vanish identically. On 7141] the are con- 
stants, and the vector 


is linearly independent of 


This last condition can be satisfied since by hypothesis m is less than n, so 
that we have the free choice of at least the last two components of the vec- 
tor (9.2). 

By known theorems on differential equations, for ]/=1,---, +2 the 
equations 


(9.4) dyt(yo, )\Hi+ Yor ye = 0, yé) Hy = 

(8B 
have unique solutions Hj vanishing at 4. The »+1 homogeneous equations 
(9.5) cH (ts) = 0 (i =0,1,-++,n) 


in the »+2 unknowns c; have a non-trivial solution. We define 


(9.6) = ci (2) (1 StS h). 


Then 4(¢) satisfies the equations (3.6), because of (9.4). Since 4 vanishes at 4, 
and at #2, it satisfies (3.7) with @=0. Hence (4, @) is an admissible variation 
set. If it were essentially null, there would be a function p(¢) such that 


(9.7) # (0) = p(t) (4 StS 


This p(t) is easily seen to be continuous and to have corners only at the 
points 7;. If \ is the least integer such that c, #0, then 4‘(¢) is identically zero 
on [t:, 7]. By (9.7) and (9.6) 


Hy'(m +) = = +) 


Since H}(t,) vanishes, by (9.4) we see that for ]=\ the vector (9.2) is a multi- 
ple of (9.3), contrary to its choice. Lemma 6 is therefore established. 

Now by Lemma 6 and hypothesis (3) of Theorem I there are multipliers $\lambda^0 \ge 0, \lambda^1(t), \dots, \lambda^m(t)$ with which the Euler equations, transversality condition and strengthened Clebsch condition hold. From the last mentioned condition we see that the form

(9.8) $F_{r^i r^j}(y_0(t), \dot y_0(t), \lambda(t))\, v^i v^j$

is positive on the set of unit vectors $v$ which are orthogonal to $\dot y_0$ and satisfy equations (3.5). This set of vectors is bounded and closed, so on it the form (9.8) has a positive lower bound, which we denote by $2\epsilon$. It follows that on the set of vectors $v$ just described the inequality

(9.9) $F_{r^i r^j}\, v^i v^j - \epsilon\, v^i v^i > 0$

holds, since the coefficient of $\epsilon$ has the value 1. By homogeneity, (9.9) continues to hold for all non-null vectors $v$ which are orthogonal to $\dot y_0$ and satisfy equations (3.5).

Let $v$ be any vector which satisfies (3.5) and is linearly independent of $\dot y_0$. It can be resolved into components

(9.10) $v = v_0 + k\, \dot y_0$,

where $v_0$ is orthogonal to $\dot y_0$. Since $v$ is linearly independent of $\dot y_0$, the component $v_0$ is not null. The homogeneity of $F$ and $\phi^\beta$ entails the well known consequence

(9.11) $\dot y_0^i(t)\, F_{r^i r^j}(y_0, \dot y_0, \lambda) = 0$, $\quad \dot y_0^i\, \phi^\beta_{r^i}(y_0, \dot y_0) = 0$.

Now $\dot y_0$ and $v$ both satisfy the linear equations (3.5), hence by (9.10) $v_0$ also satisfies those equations. Therefore (9.9) holds with $v_0$ in place of $v$, and with the help of (9.11) we deduce

(9.12) $F_{r^i r^j}(y_0, \dot y_0, \lambda)\, v^i v^j - \epsilon\,[v^i v^i - (\dot y_0^i v^i)^2] = F_{r^i r^j}\, v_0^i v_0^j - \epsilon\, v_0^i v_0^i > 0$.

Inequality (9.12) shows that the integral

(9.13) $I_\epsilon(\eta) = \int_{t_1}^{t_2} \{2\omega(t, \eta, \dot\eta) - \epsilon\,[\dot\eta^i \dot\eta^i - (\dot y_0^i \dot\eta^i)^2]\}\,dt$

satisfies the hypotheses of Lemma 5, so that

(9.14) $\liminf_{q \to \infty} I_\epsilon(\eta_q) \ge I_\epsilon(\eta_0) = 0$.

Since $u_q$ approaches $u_0 = 0$ as $q \to \infty$, this and (3.8) together imply

(9.15) $\liminf_{q \to \infty} \Big\{J_2(\eta_q, u_q, \lambda) - \epsilon \int_{t_1}^{t_2} [\dot\eta_q^i \dot\eta_q^i - (\dot y_0^i \dot\eta_q^i)^2]\,dt\Big\} \ge 0$.

By (4.8) and (5.3) the function

$$\dot y_0^i(t)\, \eta_q^i(t)$$

is linear in $t$, and it converges uniformly to zero, since $\eta_0$ is zero. Except on a set of measure zero we have

$$\dot y_0^i\, \dot\eta_q^i = (\dot y_0^i\, \eta_q^i)^{\cdot} - \ddot y_0^i\, \eta_q^i,$$

and both terms on the right tend uniformly to zero. So the last term in the square bracket in (9.15) can be omitted without affecting the limit. In (5.4) the first and second terms tend to zero, so the third term tends to 1 as $q \to \infty$. Thus (9.15) implies

(9.16) $\liminf_{q \to \infty} J_2(\eta_q, u_q, \lambda) - \epsilon \ge 0$.

On the other hand, the considerations leading to inequality (8.13) are applicable to Case II as well as to Case I, so inequality (8.13) must hold. This contradicts (9.16). Hence in each of the two possible cases we have arrived at a contradiction, and Theorem I is established.

10. Statement of problem in non-parametric form. From Theorem I we can deduce its analogue for problems in non-parametric form. We use the formulation due to Morse and Myers(**). The functions

$$f(x, z, p), \quad \phi^\beta(x, z, p) \qquad (\beta = 1, \dots, m < n)$$

will be supposed to be defined and of class $C^2$ on an open point set $S_1$ in $(x, z, p)$-space. The functions $T^{is}(a)$ $(i = 0, \dots, n;\ s = 1, 2)$ are defined and of class $C^2$ on an open set $R_2$ in $(a^1, \dots, a^r)$-space. An admissible set $[z, a]$ is a set of functions $z^c(x)$ absolutely continuous on an interval $[x_1, x_2]$ such that for almost all $x$ in $[x_1, x_2]$ the point $(x, z(x), \dot z(x))$ is in $S_1$ and satisfies the equations

(10.1) $\phi^\beta(x, z(x), \dot z(x)) = 0$ $(\beta = 1, \dots, m)$,

together with a set of parameters $a$ in $R_2$ with which the end conditions

(10.2) $x_s = T^{0s}(a)$, $\quad z^c(x_s) = T^{cs}(a)$ $(c = 1, \dots, n;\ s = 1, 2)$

are satisfied.
The problem of Bolza in non-parametric form is the problem of minimizing the functional

(10.3) $J[z, a] = \theta(a) + \int_{x_1}^{x_2} f(x, z, \dot z)\,dx$

in the class of admissible sets $[z, a]$.
Let $[z_0, a_0]$ be an admissible set in which the functions

(10.4) $z^c = z_0^c(x)$ $(x_{1,0} \le x \le x_{2,0};\ c = 1, 2, \dots, n)$


(8) M. Morse and S. B. Myers, The problems of Lagrange and Mayer with variable end points, 
Proceedings of the American Academy of Arts and Sciences, vol. 66 (1931), pp. 235-253. 

(*) M. R. Hestenes, Sufficient conditions for the problem of Bolza in the calculus of variations, 
these Transactions, vol. 36 (1934), pp. 793-818. 




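are of class $C^1$. The set $[z_0, a_0]$ satisfies the Euler equations with multipliers $\lambda^0, \lambda^1(x), \dots, \lambda^m(x)$ if the functions

(10.5) $F(x, z, p, \lambda) = \lambda^0 f(x, z, p) + \lambda^\beta \phi^\beta(x, z, p)$

satisfy the equations

(10.6) $\dfrac{d}{dx} F_{p^c}(x, z_0, \dot z_0, \lambda) = F_{z^c}(x, z_0, \dot z_0, \lambda)$ $(c = 1, \dots, n)$.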


It satisfies the transversality conditions with multipliers \°, A(x), - - - , \"(x) 
if 

(10.7) + [(F — Fo) Tr + = 0 
As usual, c has the range 1, - - - , m. The square-bracketed symbol in (10.7) 
is to be understood as follows. The functions F, and so on are first evaluated 
at (X+,0, Z0(%s,0), 20 (%s,0), A(Xs,0)), S=1, 2. Then the value of the sum inside 


the square bracket is evaluated for s=1 and for s=2, and the former value 
subtracted from the latter. 


The set $[z_0, a_0]$ satisfies the strengthened Clebsch condition with multipliers $\lambda^0, \lambda^1(x), \dots, \lambda^m(x)$ if for each $x$ in $[x_{1,0}, x_{2,0}]$ the inequality

(10.8) $F_{p^c p^d}(x, z_0(x), \dot z_0(x), \lambda(x))\, v^c v^d > 0$

(summed over $c, d = 1, \dots, n$) holds for all sets of numbers $(v^1, \dots, v^n) \ne (0, \dots, 0)$ satisfying the equations

(10.9) $\phi^\beta_{p^c}(x, z_0(x), \dot z_0(x))\, v^c = 0$ $(\beta = 1, \dots, m)$.


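An admissible variation set $[\zeta(x), u]$ is a set consisting of functions $\zeta^c(x)$, absolutely continuous and having derivatives whose squares are summable over $[x_{1,0}, x_{2,0}]$ and satisfying the equations

(10.10) $\phi^\beta_{z^c}(x, z_0, \dot z_0)\, \zeta^c(x) + \phi^\beta_{p^c}(x, z_0, \dot z_0)\, \dot\zeta^c(x) = 0$ $(\beta = 1, \dots, m)$

for almost all $x$, and a set of numbers $(u^1, \dots, u^r)$ with which the equations

(10.11) $\zeta^c(x_{s,0}) = [T^{cs}_h(a_0) - \dot z_0^c(x_{s,0})\, T^{0s}_h(a_0)]\, u^h$ $(s = 1, 2;\ c = 1, \dots, n)$

hold.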
If [f, u] is an admissible variation set, we define the second variation due 
to [{, u] by the equation 


where 




= Oru(ac) + — 20 Ts + (F — 


10.13 cs cs 
+ Ts +15 Ts) 


and 
(10.14) 2w(x, = Faest(x, 20, 26, AES! + + F 4, 


The concept of a weak relative minimum will be carried over unchanged from the parametric problem. Let the functions (10.4) be of class $C^1$, and consider another set of absolutely continuous functions

(10.15) $z^c = z^c(x)$, $\quad x_1 \le x \le x_2$.

We can use these functions to define a curve $C$ in $(n+1)$-space by means of the equation

(10.16) $x = t$, $\quad z^c = z^c(t)$ $(x_1 \le t \le x_2;\ c = 1, \dots, n)$,

and likewise for the set (10.4). For these curves the concept of first order $\epsilon$-neighborhood has already been defined in §2, and so has the concept of weak relative minimum.

For problems in non-parametric form we shall establish the following analogue of Theorem I.


THEOREM II. Let the following hypotheses be satisfied.

(1) The set $[z_0, a_0]$ defined by (10.4) is admissible, and the functions $z_0^c(x)$ $(c = 1, \dots, n)$ are of class $C^2$.

(2) For each $x$ in the interval $[x_{1,0}, x_{2,0}]$ the matrix

(10.17) $\|\phi^\beta_{p^c}(x, z_0(x), \dot z_0(x))\|$

has rank $m$.

(3) To each nonidentically vanishing admissible variation set $[\zeta, u]$ there corresponds a set of absolutely continuous multipliers $\lambda^0 \ge 0, \lambda^1(x), \dots, \lambda^m(x)$ with which the set $[z_0, a_0]$ satisfies the Euler equations, transversality condition and strengthened Clebsch condition, and with which the inequality

(10.18) $J_2[\zeta, u, \lambda] > 0$

is satisfied.
Then the set $[z_0, a_0]$ gives $J[z, a]$ a proper weak relative minimum on the class of admissible sets $[z, a]$.


In the next two sections we shall show that this theorem is in fact a con- 
sequence of Theorem I. 

11. Transformation into parametric form. We prove Theorem II by replacing
the non-parametric problem of §10 by an equivalent parametric problem.
The symbols $y^0, y^1, \dots, y^n$ will be used as alternative names for the
$x, z^1, \dots, z^n$ axes in (n+1)-space. In the (2n+2)-dimensional space of points
$(y, r) = (y^0, \dots, y^n, r^0, \dots, r^n)$ we define $R_1$ to be the set of points $(y, r)$
having $r^0 > 0$ and such that $(y^0, y^1, \dots, y^n, r^1/r^0, \dots, r^n/r^0)$ is in $R$. On $R_1$
we define functions $g(y, r)$, $\psi^\beta(y, r)$ by the equations


(11.1) $g(y, r) = r^0 f(y^0, y^1, \dots, y^n, r^1/r^0, \dots, r^n/r^0)$,
$\psi^\beta(y, r) = r^0 \phi^\beta(y^0, y^1, \dots, y^n, r^1/r^0, \dots, r^n/r^0)$.


These have the continuity and homogeneity properties specified in §2.
From (11.1) we deduce


(11.2) $g(y^0, \dots, y^n, 1, r^1, \dots, r^n) = f(y^0, y^1, \dots, y^n, r^1, \dots, r^n)$,


from which by differentiation we obtain similar identities for all existing partial
derivatives not involving differentiation with respect to $r^0$; for instance,



Moreover, identities analogous to (11.2) and its corollaries are also valid for
each of the functions $\phi^\beta$.
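The passage from a non-parametric integrand to a parametric one can be illustrated numerically. The following is only a sketch: it assumes the usual construction $g(y, r) = r^0 f(y^0, y^1, r^1/r^0)$ with n = 1, and the integrand `f`, the helper names and the sample points are hypothetical choices made for illustration, not taken from the paper. It checks that g is positively homogeneous of degree one in r and reduces to f when $r^0 = 1$.

```python
from fractions import Fraction


def f(x, z, p):
    """Hypothetical non-parametric integrand (illustration only)."""
    return p * p + x * z


def g(y, r):
    """Parametric integrand built from f; defined only for r[0] > 0."""
    y0, y1 = y
    r0, r1 = r
    if r0 <= 0:
        raise ValueError("g is defined only for r^0 > 0")
    return r0 * f(y0, y1, r1 / r0)


def is_homogeneous(y, r, k):
    """Check g(y, k r) == k g(y, r) for a positive factor k."""
    scaled = tuple(k * ri for ri in r)
    return g(y, scaled) == k * g(y, r)


y = (Fraction(1, 2), Fraction(3))
r = (Fraction(2), Fraction(5))

# Positive homogeneity of degree one in r.
print(is_homogeneous(y, r, Fraction(7, 3)))
# With r^0 = 1 the parametric integrand reduces to f.
print(g(y, (Fraction(1), Fraction(5))) == f(y[0], y[1], Fraction(5)))
```

Exact rational arithmetic is used so that the two identities can be compared with `==` rather than with a floating-point tolerance.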

We now have the data needed for setting up the parametric problem of 
minimizing the functional 


(11.4) $I(C, a) = \theta(a) + \int_{t_1}^{t_2} g(y, y')\,dt$


in the class of sets (C, a) for which the curves C: y = y(t) have (y, y') in $R_1$ for
almost all t, satisfy the differential equations


(11.5) $\psi^\beta(y, y') = 0$
for almost all t, and satisfy the end conditions
(11.6) $y^i(t_s) = T^{is}(a)$ ($i = 0, 1, \dots, n$; $s = 1, 2$).


From (11.1) it is evident that if the set [z, a] with functions $z^\alpha(x)$ defined
by (10.15) is admissible, and C is defined by (10.16), then (C, a) is admissible
for the parametric problem, and


(11.7) $I(C, a) = J[z, a]$.


(The converse also is true; if (C, a) is admissible for the parametric problem, C 
can be represented in the form (10.16), and the functions $z^\alpha(x)$ thus obtained
are admissible with parameters a for the non-parametric problem. But we do 
not need this.) Theorem II will therefore be established if we can show that 
its hypotheses imply that the corresponding parametric problem satisfies the 
hypotheses of Theorem I. 

12. Verification of the hypotheses of Theorem I. Hypotheses (1) and (2) 
of Theorem I follow at once from the corresponding hypotheses of Theo- 
rem II. 




Let $[\eta(t), u]$ be an admissible variation set for the parametric problem.
We define 


(The notations x and t for the independent variable are interchangeable,
because of (10.16); and for the same reason $z^\alpha(x)$ and $y^\alpha(x)$ are identical,
$\alpha = 1, 2, \dots, n$.)

If we recall that
(12.2) $y_0^{0\prime}(t) = 1$,


a well known consequence of the homogeneity of the $\psi^\beta$ implies (with (3.6)
and the analogues of (11.1))


20, 26 + 20, 26 
— + — — 08) 
= 
= 0. 
So equations (10.10) hold. Equations (10.11) follow at once from (12.1) and
(3.7). Therefore, by hypothesis (3) of Theorem II there exist multipliers
$\lambda^0$, $\lambda^1(x), \dots, \lambda^m(x)$ with which the Euler equations, transversality condition
and strengthened Clebsch condition hold for the non-parametric problem,
and also the inequality (10.18).
Let us define 


(12.4) $G(y, r, \lambda) = \lambda^0 g(y, r) + \lambda^\beta\psi^\beta(y, r)$.


The Euler equations for the non-parametric problem constitute the last n of
the n+1 Euler equations for the parametric problem. But from the homogeneity
of G it can be shown that the equation


d 


holds for any admissible curve $y^i = y^i(t)$ of class C''. Since (12.2) holds and
the last n of the factors in braces vanish for $y_0(t)$, the first also vanishes, and
all n+1 Euler equations are satisfied.

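From the homogeneity relation


(12.6) $G(y, r, \lambda) = r^i G_{r^i}(y, r, \lambda)$,
with (12.2) and (11.1), we find that
(12.7) $G_{r^0}(y_0, y_0', \lambda) = F(x, z_0, z_0', \lambda) - z_0^{\alpha\prime}F_{p^\alpha}(x, z_0, z_0', \lambda)$.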




This shows that the transversality conditions (10.7) imply the transversality 
conditions (3.3) for the parametric problem. 


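Let $(v^0, \dots, v^n)$ be a vector linearly independent of $y_0'(t)$ and satisfying
the equations (3.5) with $\psi^\beta$ in place of $\phi^\beta$. Define
(12.8) $w^\alpha = v^\alpha - v^0 y_0^{\alpha\prime}(t)$.
By the analogue of (12.6) and (12.2),


$\phi^\beta_{p^\alpha}(x, z_0, z_0')w^\alpha = \psi^\beta_{r^i}(y_0, y_0')(v^i - v^0 y_0^{i\prime})$
$= \psi^\beta_{r^i}(y_0, y_0')v^i - v^0\psi^\beta(y_0, y_0')$
$= 0.$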


Also, the $w^\alpha$ are not all zero, since $(v^0, \dots, v^n)$ is not a multiple of $y_0'(t)$. Since
by hypothesis the strengthened Clebsch condition holds,


(12.9) $F_{p^\alpha p^\beta}(x, z_0, z_0', \lambda)w^\alpha w^\beta > 0$.

A consequence of the homogeneity of G is

(12.10) $G_{r^i r^j}(y, r, \lambda)r^j = 0$.

From this and (12.8) we have

(12.11) $G_{r^i r^j}(y_0, y_0', \lambda)v^i v^j = F_{p^\alpha p^\beta}(x, z_0, z_0', \lambda)w^\alpha w^\beta$,


which is positive by (12.9). Therefore the strengthened Clebsch condition
holds for the parametric problem.


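Let us define
(12.12) $2\omega^*(t, \eta, \rho) = G_{y^i y^j}(y_0, y_0', \lambda)\eta^i\eta^j + 2G_{y^i r^j}\eta^i\rho^j + G_{r^i r^j}\rho^i\rho^j$.
This is a symmetric quadratic form in $(\eta, \rho)$, whence
(12.13) $2\omega^*(t, \eta, \rho) = \eta^i\omega^*_{\eta^i}(t, \eta, \rho) + \rho^i\omega^*_{\rho^i}(t, \eta, \rho)$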
and 
(12.14) p) + m, p) = B) + 4, B). 


(These identities can also be established easily by direct computation.) It is
well known that for every set of functions $\eta^i(t)$ of class C'' the equations


(12.15) $y_0^{i\prime}(t)\left\{\frac{d}{dt}\omega^*_{\rho^i}(t, \eta, \eta') - \omega^*_{\eta^i}(t, \eta, \eta')\right\} = 0$


are satisfied identically. We now prove a lemma.
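The two algebraic facts used for the quadratic form in $(\eta, \rho)$, namely Euler's identity for a form of degree two and the symmetry of its polar form, can be checked by direct computation. The sketch below is an illustration only: the coefficient matrices `A`, `B`, `C` are hypothetical stand-ins for the second partial derivatives of G along the extremal, with `A` and `C` symmetric, and the function names are not from the paper.

```python
from fractions import Fraction as Fr

N = 2
A = [[Fr(2), Fr(1)], [Fr(1), Fr(3)]]    # symmetric (role of G_{yy})
B = [[Fr(0), Fr(4)], [Fr(-1), Fr(2)]]   # arbitrary (role of G_{yr})
C = [[Fr(5), Fr(-2)], [Fr(-2), Fr(1)]]  # symmetric (role of G_{rr})


def two_omega(eta, rho):
    """Value of 2*omega = A eta eta + 2 B eta rho + C rho rho."""
    total = Fr(0)
    for i in range(N):
        for j in range(N):
            total += (A[i][j] * eta[i] * eta[j]
                      + 2 * B[i][j] * eta[i] * rho[j]
                      + C[i][j] * rho[i] * rho[j])
    return total


def grad(eta, rho):
    """Partial derivatives of omega with respect to eta and rho."""
    w_eta = [sum(A[i][j] * eta[j] + B[i][j] * rho[j] for j in range(N))
             for i in range(N)]
    w_rho = [sum(B[i][j] * eta[i] + C[j][i] * rho[i] for i in range(N))
             for j in range(N)]
    return w_eta, w_rho


def pair(eta1, rho1, eta2, rho2):
    """eta1 . omega_eta(eta2, rho2) + rho1 . omega_rho(eta2, rho2)."""
    w_eta, w_rho = grad(eta2, rho2)
    return sum(eta1[i] * w_eta[i] + rho1[i] * w_rho[i] for i in range(N))


eta, rho = [Fr(1), Fr(2)], [Fr(3), Fr(-1)]
eta2, rho2 = [Fr(0), Fr(1)], [Fr(2), Fr(2)]

# Euler's identity: pairing a point with its own gradient gives 2*omega.
print(pair(eta, rho, eta, rho) == two_omega(eta, rho))
# Symmetry of the polar form in its two arguments.
print(pair(eta, rho, eta2, rho2) == pair(eta2, rho2, eta, rho))
```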


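LEMMA 7. If the functions $\nu(t), \eta^0(t), \dots, \eta^n(t)$ are absolutely continuous
and the squares of their derivatives are summable, then


(12.16) $\int_{t_1}^{t_2}\{(\nu y_0^{i\prime})\,\omega^*_{\eta^i}(t, \eta, \eta') + (\nu y_0^{i\prime})'\,\omega^*_{\rho^i}(t, \eta, \eta')\}\,dt = \nu\eta^i G_{y^i}(y_0, y_0', \lambda)\Big|_{t_1}^{t_2}$.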


Let us first suppose that the $\eta^i(t)$ are of class C''. By an integration by
parts, with use of (12.15), the left member of (12.16) is reduced to


(12.17) $\nu y_0^{i\prime}\,\omega^*_{\rho^i}(t, \eta, \eta')\Big|_{t_1}^{t_2}$.


The analogue of (12.6) holds for $G_{y^i}$, and with the help of this and (12.10)
the expression (12.17) transforms into the right member of (12.16). Hence the
lemma holds if the $\eta^i(t)$ are of class C''.

By hypothesis, the derivatives of the $\eta^i$ are of class $L^2$, so by a known
property of the Lebesgue integral there is a sequence of polynomials


Pe 
such that 


ts 
(12.18) lim la — pd |*dt = 0. 
4 


By the inequality of Schwarz, the functions 
pil) + f 
4 


converge uniformly to $\eta^i(t)$. The coefficients of the form $\omega^*$ are continuous, so


+ [wsilt, Pe — Jat = 0. 
For each p, the analogue of (12.16) holds, so by Schwarz’ inequality 


(12.20) { rye’) Po Pa) + (rye bu at 


= lim vp Yo, yo » d) 


é 
= vn Gyt( Yo, yo ’ 
This establishes the lemma. 




By definition (3.8), 


te 
(12.21) Ia(n, #, = f (t, 0, 
ts 


where 
From (12.13), (12.21) and (12.1) we deduce 


I2(n, u, d) + f) + f) } dt 


te 
9 i) * 0 ne 
+f {n yo wat(t, Yor (n Yo) ) 
+ (n¥0)’) } dt 
The integrand in the first integral in the right member is $2\omega^*(t, \zeta, \zeta')$, which
by (12.12) is the same as $2\omega(t, \zeta, \zeta')$, since $\zeta^0$ vanishes identically. By Lemma 7,
the third integral has the value 
Since $\zeta^0 = 0$, by (12.4) and (11.1) this is equal to
(12.25) $\eta^0\zeta^\alpha F_{z^\alpha}(x, z_0, z_0', \lambda)\Big|_{x_1}^{x_2}$.
By the same argument, the fourth integral has the value 
By (12.14), the second integral has the same value (12.25) as the third. We
substitute these evaluations in (12.23), and for the end values of $\eta^i$ and $\zeta^\alpha$ we
substitute the values given by (3.7) and (10.11). On collecting terms and recalling
(12.7), (10.12) and (10.13), we find


(12.27) $I_2(\eta, u, \lambda) = J_2[\zeta, u, \lambda]$,


which is positive by hypothesis.




We have now verified all the hypotheses of Theorem I for the parametric
formulation of our problem, so by that theorem the set $(C_0, a_0)$ gives a proper
weak relative minimum to $I(C, a)$ on the class of admissible sets (C, a). This
immediately implies that $[z_0, a_0]$ gives $J[z, a]$ a proper weak relative minimum
on the class of admissible sets [z, a], and Theorem II is established.

13. A corollary. If it were not for our unusually inclusive definition of admissible
variation set, Theorem II would at once include Hestenes' sufficiency
theorem for weak relative minima(10). For Hestenes assumes the hypotheses
of Theorem II, with the additional requirement that the multipliers can be
chosen independently of the sets $[\zeta, u]$. However, it requires some proof to
show that in this case the assumption that the second variation is positive for
all variation sets admissible in the sense of §3 is necessarily satisfied if the second
variation is positive whenever $[\zeta, u]$ is an admissible set and the $\zeta^\alpha(x)$
are of class D'. We establish this for the normal case; as Bliss has shown(11),
the theorem of Hestenes can be deduced from the sufficiency theorem for the
normal problem.

For each real number a, let us define 


Clearly 
(13.2) $Q_0[\zeta, u, \lambda] = J_2[\zeta, u, \lambda]$.


As in the discussion of (7.7), we can show that if a is sufficiently large the 
quadratic form 


(13.3) ND + 
is positive definite and 
(13.4) 2w(x, $, 0) + 


is positive for all nonidentically zero sets $(\zeta, v)$ satisfying (10.9). We choose
such an a; then there is a positive $\epsilon$ such that


(13.5) + = 
and 
(13.6) + agege + FF ] 


whenever t satisfies (10.9). 
Let $K_1$ be the collection of admissible sets $[\zeta, u]$ such that


(10) Hestenes, loc. cit., p. 816.
(11) G. A. Bliss, Normality and abnormality in the calculus of variations, these Transactions,
vol. 43 (1938), pp. 365-376.


72,0 
(13.7) = 1. 


71,0 
For any $[\zeta, u]$ in $K_1$ and any number b we have
(13.8) $Q_a[\zeta, u, \lambda] - Q_b[\zeta, u, \lambda] = a - b$.


Since the forms (13.3) and (13.4) are non-negative, $Q_a$ has a non-negative
lower bound m on the class $K_1$. Let $[\zeta_q, u_q]$ be a sequence of sets in $K_1$ for
which $Q_a$ tends to its lower bound m on $K_1$. By (13.5) and (13.6),


so the value of the expression in braces is bounded on the sequence $[\zeta_q, u_q]$.
The boundedness of the integral of $|\zeta_q'|^2$ implies the equi-continuity of the $\zeta_q$,
as in (5.6). The boundedness of $|u_q|$ implies the boundedness of $|\zeta_q(x_{1,0})|$;
and this with the equi-continuity of the $\zeta_q$ implies their uniform boundedness.
By Ascoli's theorem, we can select a subsequence converging uniformly to a
limit $\zeta_0(x)$; we suppose $[\zeta_q, u_q]$ such a sequence. We may also suppose that
the $u_q$ converge to a limit $u_0$. As in §5, the $\zeta_0^\alpha$ are absolutely continuous, and
the squares of their derivatives are summable, and by Lemma 2


(13.10) $Q_a[\zeta_0, u_0, \lambda] \le \liminf Q_a[\zeta_q, u_q, \lambda] = m$.


But $[\zeta_0, u_0]$ also belongs to $K_1$, so inequality is impossible in (13.10). That is,
$[\zeta_0, u_0]$ minimizes $Q_a$ on the class $K_1$.
By (13.8), $[\zeta_0, u_0]$ also minimizes $Q_{a-m}$ on $K_1$, and


$Q_{a-m}[\zeta_0, u_0, \lambda] = 0$.


Since $Q_{a-m}$ is homogeneous of degree 2 in $[\zeta, \zeta', u]$ we see that $Q_{a-m}$ is
non-negative for all admissible variation sets, and on this class $[\zeta_0, u_0]$ minimizes
$Q_{a-m}$.

Now we need only a slight extension of Bliss'(12) proof of the multiplier rule
to show that there are absolutely continuous multipliers $\mu^0$, $\mu^1(x), \dots, \mu^m(x)$
such that for the function


2(x, = 2w'w(x, + 20, 26 + 


the du Bois-Reymond relation is satisfied: 


Qre(x, So, fo) = a. + f Qe(x, So, fo)dx, 


(12) G. A. Bliss, The problem of Lagrange in the calculus of variations, American Journal of
Mathematics, vol. 52 (1930), pp. 673-744.


where $a_\alpha$ is a constant ($\alpha = 1, \dots, n$). But if the minimizing curve is normal,
so is $[\zeta_0, u_0]$. So we may suppose $\mu^0 = 1$. It follows that


Qrega(x, Fo, Fo)0°v? = Fyepa(x, Zo, 26 


and this last is positive for all nonzero sets $(v^1, \dots, v^n)$ satisfying (10.9).
From this we deduce without trouble that $\zeta_0'$ must be continuous, so that $\zeta_0$
is of class C'.

Now by (13.2) and (13.8) we have


$J_2[\zeta_0, u_0, \lambda] = Q_{a-m}[\zeta_0, u_0, \lambda] + m - a = m - a$.


But since $\zeta_0$ is of class C', by hypothesis the left member is positive. Hence
m - a is positive, and the minimum of $J_2$ on the class $K_1$ is positive. It follows
by homogeneity that $J_2$ is positive for all nonidentically vanishing admissible
variation sets $[\eta, u]$, as was to be proved.

Since the proofs in this section are rather long, the following remark may
not be amiss. Most of the theory of the calculus of variations, originally developed
for functions of class D', can be extended without difficulty to absolutely
continuous functions. It seems reasonable to suppose that in the majority
of specific problems in which the inequality $J_2[\zeta, u, \lambda] > 0$ can be
established for functions $\zeta$ of class D', it will be possible to use essentially
the same proof to establish the positiveness of $J_2$ for all variation sets admissible
in the sense of §3.

14. An example. We now exhibit an example of a problem to which Theo- 
rem II applies, but which is not covered by any previously published theorem. 
A simpler example could be given, but this one will have the virtue of being 
non-trivial. It is convenient to use subscripts instead of superscripts for enu- 
meration of variables, and so on, since we need to use exponents. 

Interior to the interval [0, 1] we choose three disjoint closed intervals
$\delta_1, \delta_2, \delta_3$ all of the same length $\epsilon$ (necessarily less than 1/3). It is easy to find
functions $\gamma_1(x)$, $\gamma_2(x)$ of class C'' on [0, 1] such that $\gamma_1$ has the value 1 on $\delta_1$
and the value 0 on $\delta_2$ and $\delta_3$, while $\gamma_2$ has the value 1 on $\delta_2$ and the value 0 on
$\delta_1$ and $\delta_3$. We define


(14.1) f(x, Ps) Pit Po + Ps — — 

The problem is to minimize ; 

(14.2) $J[z, a] = \int_0^1 f(x, z, z')\,dx$

in the class of sets $[z_1, \dots, z_5, a_1, a_2, a_3]$ with absolutely continuous functions
$z_i(x)$ ($i = 1, \dots, 5$) which satisfy the end conditions

(14.3) $x_1 = z_i(x_1) = 0$ ($i = 1, \dots, 5$),


$x_2 = 1$, $z_4(x_2) = z_5(x_2) = 0$,


$z_1(x_2) = a_1$, $z_2(x_2) = a_2$, $z_3(x_2) = a_3$,




and the differential equations 
3’) + — + = 0, 
2’) = + + + = 0. 
By use of Theorem II we shall show that the set
(14.5) $z_i(x) = 0$, $a_1 = a_2 = a_3 = 0$ ($0 \le x \le 1$)


(14.4) 


gives J[z, a] a weak relative minimum on the class of admissible sets [z, a]. 
As usual, we define F The Euler equations sim- 
plify to the form 


and therefore are satisfied for all sets of constant multipliers. Henceforth we
assume $\lambda_1$ and $\lambda_2$ constant. The transversality condition is identically satisfied.
The quadratic form (10.8) is


(14.6) + 09 + 05), 
while equations (10.9) become 
= 0. 


Hence the strengthened Clebsch condition holds if and only if $\lambda_0$ is positive.
Equations (10.10) take the form


f 0, fs 0, 

and are therefore satisfied by any set $(\zeta_1, \dots, \zeta_5)$ with $\zeta_4$ and $\zeta_5$ constant.
Equations (10.11) become

$\zeta_i(0) = 0$ ($i = 1, \dots, 5$),

$\zeta_1(1) = u_1$, $\zeta_2(1) = u_2$, $\zeta_3(1) = u_3$, $\zeta_4(1) = \zeta_5(1) = 0$.

Thus an admissible variation set is a set $[\zeta, u]$ in which $\zeta_4$ and $\zeta_5$ vanish
identically and the other three $\zeta_i$ vanish at x = 0 and are absolutely continuous
and have derivatives summable with their squares, and in which the $u_i$
satisfy (14.7).


The coefficients 5,; are all zero, so if we observe that certain of the terms 
in 2w are perfect differentials and make use of (14.7) we obtain 


(14.7) 


1 
+ Drauss. 


(14.8) 


If for every nonidentically null admissible variation set we can choose constants
$\lambda_0 > 0$, $\lambda_1$ and $\lambda_2$ for which this is positive, all the hypotheses of Theorem II
will be verified.


M =0, 




If the derivatives $\zeta_i'$ vanish almost everywhere, the $\zeta_i$ are constants. In
this case, by (14.7) the $\zeta_i$ vanish identically and the $u_i$ also vanish, so $[\zeta, u]$
is identically null. Therefore if $[\zeta, u]$ is not identically null, the integral in
(14.8) is positive. If $u_1 = u_2 = 0$, the choice $\lambda_0 = 1$, $\lambda_1 = \lambda_2 = 0$ serves. Otherwise
we could choose \1 = 4(ui —12), = Then 


u] = 2(ui + + f+ > 0. 


It is easy to show(13) that it is not possible to choose any one set of multipliers
with which the second variation is positive for every nonidentically zero
admissible variation set $[\zeta, u]$.

To show that the problem is not a trivial one, in which the extremal (14.5)
is isolated, and also that the problem does not impose any hidden end conditions,
let us choose any three numbers $a_1, a_2, a_3$. Let $z_1, z_2$ be any Lipschitzian
functions vanishing at x = 0 and assuming the respective values $a_1, a_2$ at x = 1.
We determine three numbers $\alpha_1, \alpha_2, \alpha_3$ by the conditions


<a; + (1/2)(ai — a3) = 0, 
(14.9) eas + = 0, 
€(a; + + a3) — a3 = 0. 


(The number $\epsilon$ was defined in the second paragraph of this section.) Let $z_3'$ be
the function which has the value $\alpha_i$ on the interval $\delta_i$ ($i = 1, 2, 3$) and is zero
elsewhere, and let


$z_3(x) = \int_0^x z_3'(x)\,dx$.


By the last of equations (14.9) we have $z_3(1) = a_3$. The functions $z_4, z_5$ are determined
by (14.4), with the initial values 0. If we integrate from 0 to 1 in
(14.4), by (14.9) we find $z_4(1) = z_5(1) = 0$. Hence the functions $z_i(x)$ satisfy the
conditions (14.3) and (14.4). Furthermore, it is clear that they can be made
to lie in an arbitrarily small first order neighborhood of (14.5) by restricting
$|z_1'|$ and $|z_2'|$ to be uniformly small and restricting $a_1$, $a_2$ and $a_3$ to lie near
zero.
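The construction of $z_3$ can be made concrete. The following sketch is an illustration under stated assumptions: the three intervals, the common length `eps` and the step values `alphas` are hypothetical sample data, and the function name `z3` is chosen here, not in the paper. It integrates a step function that equals $\alpha_i$ on $\delta_i$ and zero elsewhere, and confirms that its value at x = 1 is $\epsilon(\alpha_1 + \alpha_2 + \alpha_3)$, which is what the last of conditions (14.9) equates to $a_3$.

```python
from fractions import Fraction as Fr

# Hypothetical data: three disjoint closed intervals interior to [0, 1],
# all of the same length eps = 1/5, and sample step values alpha_i.
eps = Fr(1, 5)
intervals = [(Fr(1, 10), Fr(3, 10)),
             (Fr(2, 5), Fr(3, 5)),
             (Fr(7, 10), Fr(9, 10))]
alphas = [Fr(2), Fr(-3), Fr(5)]


def z3(x):
    """Exact integral from 0 to x of the step function z3'."""
    total = Fr(0)
    for (lo, hi), alpha in zip(intervals, alphas):
        overlap = min(x, hi) - lo   # length of [lo, hi] lying below x
        if overlap > 0:
            total += alpha * overlap
    return total


print(z3(Fr(0)))                         # initial value
print(z3(Fr(1)) == eps * sum(alphas))    # end value eps * (a1 + a2 + a3)
```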

15. Extension to rectifiable curves. The proofs of our sufficiency theorems
did not depend in any essential way on the continuity of the derivatives
$y_0'(t)$ or $z_0'(x)$. Theorem I, for example, can be generalized by letting $C_0$ be a
rectifiable curve. This of course requires an investigation of the concept of
first order neighborhood in the space of rectifiable curves. Such an investigation
has already been made(14). Also the formulation of the strengthened
Clebsch condition must be altered. The condition as stated in §3 is equivalent,
when the functions $y_0(t)$ are of class C', to the following.


(13) E. J. McShane, On the second variation in certain normal problems of the calculus of variations,
American Journal of Mathematics, vol. 63 (1941), §5.

There is a positive number $\epsilon$ such that the inequality


holds for all t and all vectors v orthogonal to $y_0'(t)$.

It is this latter form which seems appropriate for extension to the case of
rectifiable curves. We say that a rectifiable curve $C_0$: $y^i = y^i(t)$, $t_1 \le t \le t_2$, satisfies
the strengthened Clebsch condition if (15.1) holds for almost all t such
that $y_0'(t)$ is defined and is different from $(0, \dots, 0)$ and for all v orthogonal
to $y_0'(t)$.


(14) E. J. McShane, Curve-space topologies associated with variational problems, Annali della
R. Scuola Normale Superiore di Pisa, (2), vol. 9 (1940), pp. 45-60.


UNIVERSITY OF VIRGINIA, 
CHARLOTTESVILLE, VA. 


A NEW CLASS OF SELF-ADJOINT BOUNDARY 
VALUE PROBLEMS 


BY 
WILLIAM T. REID 


1. Introduction. Bliss [1](1) has given a general definition of self-adjointness
for a differential system of the form


(1.1) $y_i' = [A_{ij}(x) + \lambda B_{ij}(x)]y_j$,
$s_i[y] = M_{ij}y_j(a) + N_{ij}y_j(b) = 0$ ($i = 1, \dots, n$).
In the paper above referred to, he has also discussed in detail a special class 
of self-adjoint problems termed definitely self-adjoint. In a subsequent paper 
[2], Bliss has given a modification of the definition of definite self-adjointness 
which is weaker than that previously considered, and has shown that most 
of the properties deduced in [1] are still valid for systems which are definitely 
self-adjoint according to the new definition. 

If (1.1) is self-adjoint under the nonsingular real transformation
$z_i = T_{ij}(x)y_j(x)$, in both the original and modified definition of definitely
self-adjoint problems Bliss has imposed the definiteness property of the system
specifically on the matrix $S(x) = \|S_{ij}(x)\| = \|T_{ki}(x)B_{kj}(x)\|$. Now if $y_i(x)$,
($i = 1, \dots, n$), is a solution of the differential equations of (1.1) for a value $\lambda$
it follows immediately that


(1.2) $\int_a^b y_iT_{ji}(y_j' - A_{jk}y_k)\,dx = \lambda\int_a^b y_iS_{ij}y_j\,dx$.

It may be readily verified that the definiteness property of S(x) assumed by
Bliss could equally well have been phrased as a definiteness property of the
quadratic functional


$\int_a^b y_iS_{ij}y_j\,dx$.


Presented to the Society, September 12, 1940; received by the editors May 14, 1941;
corrections received June 17, 1942.

1 Numbers in square brackets refer to the bibliography at the end of this paper.

2 In the introduction, and throughout the paper where matrix notation is not more convenient,
the repetition of a subscript in a single term of an expression denotes summation with
respect to that subscript over its range of definition.


If H[y] denotes the first member of (1.2), the theory of pencils of quadratic
forms in a finite number of variables suggests that a differential system
(1.1) for which H[y] satisfies suitable conditions of definiteness may possess 
properties analogous to those enjoyed by the class of definitely self-adjoint 
problems as defined by Bliss [2]. The prime aim of this paper is to show that 
such is indeed the case. The class of self-adjoint problems herein studied, for 
which the definiteness property is placed on the functional H[y], is termed 
H-definitely self-adjoint. Moreover, it is to be emphasized that the study of 
H-definitely self-adjoint problems affords new results for systems which are 
definitely self-adjoint in the sense of Bliss. 

The definition of an H-definitely self-adjoint system is given in §2, and 
properties of the functional H[y] are presented in §3. Preliminary results for 
such a system are obtained in §4; one of the most important results therein 
contained is that of Theorem 4.3, which states that for an H-definitely self- 
adjoint system the matrix B(x) must be such that its square is identically 
zero on the interval ab. This result, which at first seems startling in aspect, 
admits certain important consequences for definitely self-adjoint systems. 
The fundamental properties of an H-definitely self-adjoint system, such as 
the reality of the characteristic values, the equality of the index and multi- 
plicity of a characteristic value, and a type of completeness property of the 
totality of characteristic solutions for such a system, are contained in §5; 
§6 is devoted to the discussion of the existence of characteristic values for 
such a system. Results for definitely self-adjoint systems are given in §7, 
whereas §8 is concerned with a special definitely self-adjoint problem which 
is related to a given system (1.1), although the system (1.1) itself may be 
neither definitely nor H-definitely self-adjoint. By the use of the results of 
the preceding section, extremizing properties of the characteristic values and 
characteristic solutions of an H-definitely self-adjoint system are established 
in §9. In §10 it is shown that an important instance of the type of boundary 
value problems associated with the calculus of variations previously studied 
by the author [9] is H-definitely self-adjoint. The connection between the 
class of problems herein treated and the boundary problems associated with 
a single linear differential equation of even order which have been studied by 
Krein [7] and Kamke [6] is indicated briefly in §11. Finally, §12 is devoted 
to the extension of the notion of H-definite self-adjointness to the case of 
systems whose coefficients are complex-valued. 

For simplicity, matrix notation is used almost exclusively in this paper.
Square matrices with n rows and columns are denoted by capital italic letters,
and the element in the ith row and jth column is denoted by the letter representing
the matrix with the subscript ij. Lower case italic letters signify vectors
with n components, the ith component being signified by a subscript i.
If $u = [u_i]$, the vectors $[M_{ij}u_j]$ and $[u_jM_{ji}]$ are denoted by Mu
and uM, respectively. The scalar product $u_iv_i$ of two vectors is written uv.
If a is a scalar, $\bar{a}$ is its complex conjugate, and for a vector u we write $\bar{u}$
for $[\bar{u}_i]$. For a matrix $M = \|M_{ij}\|$ we use $\tilde{M}$ for the transpose matrix $\|M_{ji}\|$.
Finally, if the elements of M are differentiable functions, the matrix of derivatives
is denoted by M'; similarly, if the components of u are differentiable
functions, we write $u' = [u_i']$. The norm of a vector u, $[u\bar{u}]^{1/2}$, is written
norm {u}.
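The notational conventions just listed translate directly into a few lines of code. The sketch below is only an illustration of the conventions as read here (in particular, that uM has components $u_jM_{ji}$ and that the norm uses the complex conjugate); the function names are hypothetical and are not notation from the paper.

```python
import math


def mat_vec(M, u):
    """Mu: the vector with i-th component sum_j M[i][j] * u[j]."""
    return [sum(M[i][j] * u[j] for j in range(len(u))) for i in range(len(M))]


def vec_mat(u, M):
    """uM: the vector with i-th component sum_j u[j] * M[j][i]."""
    return [sum(u[j] * M[j][i] for j in range(len(u)))
            for i in range(len(M[0]))]


def dot(u, v):
    """uv: the scalar product sum_i u[i] * v[i] (no conjugation)."""
    return sum(a * b for a, b in zip(u, v))


def transpose(M):
    """The transpose matrix, with (i, j) element M[j][i]."""
    return [list(col) for col in zip(*M)]


def norm(u):
    """norm{u}: square root of u times its complex conjugate."""
    conj = [complex(a).conjugate() for a in u]
    return math.sqrt(dot(u, conj).real)


M = [[1, 2], [3, 4]]
u = [1, -1]
print(mat_vec(M, u), vec_mat(u, M))
# uM agrees with the transpose of M applied to u.
print(vec_mat(u, M) == mat_vec(transpose(M), u))
```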

2. Definition of H-definitely self-adjoint systems. In the following pages
it will be assumed that the elements of the matrices A(x) and B(x) are real
single-valued continuous functions on the finite interval $a \le x \le b$ and that the
elements of B(x) are not all identically zero on this interval. The elements
of the matrices M and N are supposed to be real constants such that the $n \times 2n$
matrix $\|M_{ij}\ N_{ij}\|$ is of rank n. Moreover, because of its frequent occurrence,
we write $\mathcal{L}[y]$ for the vector differential operator $y' - A(x)y$. The boundary
value problem to be considered in this paper may then be written


(2.1) $\mathcal{L}[y] = \lambda B(x)y$, $s[y] = My(a) + Ny(b) = 0$.
The system adjoint to (2.1) is 


(2.2) = — rzB(x), —t[z] = 2(a)P + = 0, 


where 2 [z] is the adjoint differential operator 2’-+2A (x), and p=(p,) =(Pi,), 
q= (qi) =(Qis), G=1, - - - , 2), are nm linearly independent solutions of the alge- 
braic equations Mp— Nq=0. 

According to the modified definition of Bliss [2] the system (2.1) is definitely
self-adjoint with a matrix T, or simply definitely self-adjoint, whenever
the following conditions are satisfied:

(i) The system is self-adjoint under the nonsingular real transformation
z = T(x)y; that is, for arbitrary values of $\lambda$ a vector y satisfies the differential
equations (or boundary conditions) of (2.1) if and only if the associated vector
z = Ty satisfies the differential equations (or boundary conditions) of (2.2).
The elements of T(x) are supposed to be of class C' on the interval ab.

(ii) The matrix $S(x) = \tilde{T}(x)B(x)$ is symmetric on ab.

(iii) The quadratic form uS(x)u is positive semi-definite on ab. 

(iv) There exists no nonidentically vanishing solution y of $\mathcal{L}[y] = 0$,
s[y] = 0 such that $B(x)y(x) \equiv 0$ on ab.

The wording of hypothesis (iv) differs slightly from that used by Bliss [2, 
property (3), p. 414]. However, when (ii) and (iii) are satisfied the above 
hypothesis (iv) is readily seen to be equivalent to the property (3) of Bliss. 
For the present treatment the form (iv) is preferable. 

It is to be remarked [1, p. 569] that a nonsingular matrix T(x) whose elements
are of class C' on ab satisfies condition (i) if and only if


(2.3) $TA + \tilde{A}T + T' = 0$, $TB + \tilde{B}T = 0$ on ab,
$MT^{-1}(a)\tilde{M} = NT^{-1}(b)\tilde{N}$.


Consequently, whenever (ii) is also satisfied by T we have
(2.4) $S = \tilde{T}B = \tilde{B}T = -TB$.
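The chain of equalities in (2.4) can be checked on a small example. The sketch below is a hedged illustration, not part of the paper: it reads the relations as $TB + \tilde{B}T = 0$ and $S = \tilde{T}B = \tilde{B}T = -TB$, with a tilde denoting the transpose, and it uses hypothetical constant 2 x 2 matrices `T` and `S0`, with `B` defined as their product so that the algebraic conditions hold. Only this algebra is tested; the differential condition on T' and the boundary condition in (2.3) are not.

```python
def matmul(X, Y):
    """Matrix product of two square lists-of-lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(X):
    return [list(col) for col in zip(*X)]


def neg(X):
    return [[-a for a in row] for row in X]


def add(X, Y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]


# Hypothetical data: T is skew-symmetric and orthogonal, S0 is symmetric.
T = [[0, 1], [-1, 0]]
S0 = [[2, 1], [1, 3]]
B = matmul(T, S0)

S = matmul(transpose(T), B)                        # S = T~ B
print(S == transpose(S))                           # condition (ii): S symmetric
print(add(matmul(T, B), matmul(transpose(B), T)))  # TB + B~T, expected zero
print(S == matmul(transpose(B), T) == neg(matmul(T, B)))
```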


For y a solution of (2.1) corresponding to a characteristic value $\lambda$, relation
(1.2) becomes in matrix and vector notation


(2.5) $\int_a^b y\tilde{T}\mathcal{L}[y]\,dx = \lambda\int_a^b ySy\,dx$.


Now the above hypothesis (iii) clearly implies a positive semi-definite character
of the quadratic functional


$\int_a^b ySy\,dx$.


Indeed, (ii) and (iii) together imply that this functional is positive for all vectors
y whose components are continuous on ab and such that $B(x)y(x) \not\equiv 0$ on
this interval.

The quadratic functional upon which certain assumptions of definiteness
are to be imposed in this paper is


(2.6) $H[y] = \int_a^b y\tilde{T}\mathcal{L}[y]\,dx$,
which appears as the left-hand member of (2.5).

For convenience, we shall denote by L the linear vector space consisting
of all vectors y satisfying the following conditions: ($\alpha$) the components of y
are real-valued and of class C' on ab; ($\beta$) s[y] = 0; ($\gamma$) there exists a corresponding
vector g(x) with real-valued continuous components such that
$\mathcal{L}[y] = Bg$ on ab.

Instead of the above hypothesis (iii) we shall now assume the following
condition:

(iii)' The quadratic functional H[y] is positive for arbitrary vectors y of L
such that $B(x)y(x) \not\equiv 0$ on ab.

A system (2.1) which satisfies hypotheses (i), (ii), (iii)’ and (iv) will be 
termed H-definitely self-adjoint with the matrix T, or simply H-definitely self- 
adjoint; the prefix “H-” indicates that it is the functional H[y] which pos- 
sesses the definiteness property. Correspondingly, a system which is definitely 
self-adjoint as defined by Bliss [2] might be termed S-definitely self-adjoint. 
It is to be pointed out that in the treatment of definitely self-adjoint systems, 
as well as in the present discussion of H-definitely self-adjoint systems, the 
space L occupies a central position. 

Now a linear change of parameter in (2.1), replacing $\lambda$ by $\lambda + \lambda_0$, has the
effect of substituting $\mathcal{L}[y] - \lambda_0By$ for $\mathcal{L}[y]$. Hence it is to be emphasized that
as far as the qualitative properties of (2.1) are concerned the hypothesis (iii)'
is no stronger than the assumption that there is some value of $\lambda_0$ such that
the functional


$H[y: \lambda_0] = \int_a^b y\tilde{T}(\mathcal{L}[y] - \lambda_0By)\,dx$


satisfies the definiteness property of (iii)'.

In view of equations (2.3), a system (2.1) which is self-adjoint with a
matrix T is also self-adjoint with the matrix $-T$, $\tilde{T}$ or $-\tilde{T}$. In particular,
if (2.1) is H-definitely self-adjoint with a matrix T it is also H-definitely self-adjoint
with the matrix $-\tilde{T}$. If hypotheses (i), (ii) and (iv) are satisfied by
(2.1) with a matrix T and the functional (2.6) is negative for arbitrary vectors
y of L such that $By \not\equiv 0$, this functional can be replaced by one for which (iii)'
as stated is satisfied by using the transformation matrix $-T$ instead of T.
Moreover, if (2.1) is H-definitely self-adjoint with a matrix T then the adjoint
system (2.2), written in the form (2.1), is H-definitely self-adjoint with the
matrix $\tilde{T}^{-1}$. It may be readily verified that analogous results hold for definitely
self-adjoint systems.

We shall denote by $L^2$ the linear subspace of L consisting of all vectors y
with real components and satisfying a system $\mathcal{L}[y] = Bg$, s[y] = 0, where g(x)
is also a vector of the space L. Clearly each real characteristic solution of (2.1)
belongs to $L^2$ as well as to L. In the subsequent discussion the space $L^2$ first
occurs in Theorem 5.4.

3. Properties of the quadratic functional H[y]. If u and v are vectors
whose components are of class C' on ab, let H[u; v] denote the bilinear expression


$H[u; v] = \int_a^b u\tilde{T}\mathcal{L}[v]\,dx$.


In general $H[u; v] \ne H[v; u]$. However, we do have the following property.


LEMMA 3.1. For a system (2.1) satisfying conditions (i) and (ii) the bilinear
functional H[u; v] is symmetric on the linear vector space L.


For suppose that u and v belong to L, and that $\mathcal{L}[u] = Bg$, $\mathcal{L}[v] = Bh$.
Then w= Ty? satisfies the system [w] = —ATB, t[w]=0. By a familiar argu- 
ment it then follows (see Bliss [1, p. 567]) that 


b 
- f hSudx = wu 


The result of the lemma is then immediate since


$\int_a^b hSu\,dx = \int_a^b u\tilde{T}Bh\,dx = H[u; v]$,


$\int_a^b wBg\,dx = \int_a^b v\tilde{T}Bg\,dx = H[v; u]$.


LEMMA 3.2. Hypotheses (ii) and (iii)' imply $H[y] \ge 0$ on L.


For consider an arbitrary vector y of L. If $By \not\equiv 0$ on ab, then (iii)' implies
H[y] > 0. On the other hand, if $By \equiv 0$ on this interval the symmetry of S
insures


$H[y] = \int_a^b y\tilde{T}Bg\,dx = \int_a^b g\tilde{T}By\,dx = 0$.


The following result is an immediate consequence of Lemma 3.1 and the 
linearity of L. 


LEMMA 3.3. If the system (2.1) satisfies (i), (ii), and $H[y] \ge 0$ on L, then
(3.1) $\{H[u; v]\}^2 \le H[u]H[v]$
for arbitrary vectors u and v of L.
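The inequality (3.1) is the Schwarz inequality for a symmetric bilinear functional whose quadratic part is non-negative: since $H[u + tv] = H[u] + 2tH[u; v] + t^2H[v] \ge 0$ for every real t, the discriminant cannot be positive. A finite-dimensional analogue can be checked exhaustively on a small grid. The sketch below is an illustration only; the matrix `P`, built as a Gram matrix so that it is symmetric and positive semi-definite but singular, and the names `H1`, `H2` are hypothetical stand-ins for the functional on L.

```python
from itertools import product

# Hypothetical data: P = G^T G is symmetric and positive semi-definite
# (rank 2 in dimension 3), so the quadratic form vanishes on some
# nonzero vectors, as H[y] may when By is identically zero.
G = [[1, 2, 0], [0, 1, -1]]
P = [[sum(row[i] * row[j] for row in G) for j in range(3)] for i in range(3)]


def H2(u, v):
    """Bilinear form H[u; v] = sum_ij u_i P_ij v_j."""
    return sum(u[i] * P[i][j] * v[j] for i in range(3) for j in range(3))


def H1(u):
    """Quadratic form H[u] = H[u; u]."""
    return H2(u, u)


def schwarz_holds(u, v):
    """Check {H[u; v]}^2 <= H[u] H[v] in exact integer arithmetic."""
    return H2(u, v) ** 2 <= H1(u) * H1(v)


vectors = list(product(range(-2, 3), repeat=3))
print(all(schwarz_holds(u, v) for u in vectors for v in vectors))
# A nonzero vector on which the form vanishes; by the inequality its
# bilinear pairing with every other vector must vanish as well.
print(H1((2, -1, -1)), H2((2, -1, -1), (1, 0, 0)))
```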


LEMMA 3.4. If (2.1) satisfies hypotheses (i) and (ii) and y, y* are characteristic
functions corresponding to distinct characteristic values $\lambda$, $\lambda^*$, then


b 
(3.2) f 


The first equality of (3.2) follows by Theorem 8 of Bliss [1], and the second
relation is then immediate since $y^*\tilde{T}\mathcal{L}[y] = \lambda y^*Sy$. It is to be remarked
that this result is true quite independent of the reality of the characteristic
values and characteristic functions involved.

4. Preliminary results. In this section we shall present some results for 
H-definitely self-adjoint systems which, although of individual significance, 
are preliminary to the rest of the paper. 


THEOREM 4.1. If (2.1) is H-definitely self-adjoint, then $\lambda = 0$ is not a characteristic
value of this system.


For suppose $\lambda = 0$ were a characteristic value for such a system, and denote
by y a corresponding real characteristic solution. The condition (iv) implies
$By \not\equiv 0$ on ab, and as y clearly belongs to L it follows by (iii)' that
H[y] > 0. On the other hand, H[y] = 0 since $\mathcal{L}[y] = 0$. Hence $\lambda = 0$ is not a
characteristic value.

In connection with this theorem, we also have the following result. 


THEOREM 4.2. If (2.1) satisfies conditions (i) and (ii), $H[y] \ge 0$ on L, and
$\lambda = 0$ is not a characteristic value, then this system is H-definitely self-adjoint.


If $\lambda = 0$ is not a characteristic value for (2.1), then clearly condition (iv)
is satisfied. Moreover, suppose that y is a particular vector of L such that
H[y] = 0. Then by Lemma 3.3 it follows that H[y; v] = 0 for arbitrary vectors
v of L. But for an arbitrary vector g whose components are continuous there
exists, if $\lambda = 0$ is not a characteristic value, a unique solution of $\mathcal{L}[v] = Bg$,
s[v] = 0; for this vector v, $H[y; v] = \int_a^b ySg\,dx$. Hence $0 = yS = \tilde{T}By$, and therefore
$By \equiv 0$ on ab. That is, if y is a vector of L for which H[y] = 0 it must
be true that $By \equiv 0$. Hence condition (iii)' is also satisfied by such a system,
and it is H-definitely self-adjoint.

As a consequence of Theorem 4.1, for an H-definitely self-adjoint system
the functional H[y] is afforded an alternate representation. Let
$G(x, t) = \|G_{ij}(x, t)\|$ denote the Green's matrix (see, for example, Bliss [1,
pp. 577-581]) for the incompatible system


(4.1) $\mathcal{L}[y] = 0$, $s[y] = 0$.


Then a vector y belongs to L if and only if it is of the form

(4.2) $y(x) = \int_a^b K(x, t)\,g(t)\,dt,$


where $K(x, t) = G(x, t)B(t)$, and the components of g are continuous on ab.
We may then write


(4.3) $\begin{aligned} H[y] &= \int_a^b ySg\,dx = \int_a^b gSy\,dx \\ &= \int_a^b\!\!\int_a^b g(x)S(x)K(x, t)g(t)\,dx\,dt \\ &= \int_a^b\!\!\int_a^b g(x)K_1(x, t)g(t)\,dx\,dt, \end{aligned}$


where $K_1(x, t) = S(x)K(x, t) = S(x)G(x, t)B(t)$. Similarly, if u and v are vectors
belonging to L and $\mathcal{L}[u] = Bg$, $\mathcal{L}[v] = Bh$, we have


(4.3’) $H[u; v] = \int_a^b\!\!\int_a^b h(x)K_1(x, t)g(t)\,dx\,dt.$


Now corresponding to arbitrary vectors g, h whose components are continuous
there exist unique corresponding vectors u, v of L satisfying the above
conditions. Since by Lemma 3.1 we have $H[u; v] = H[v; u]$ it then follows that


(4.4) $K_1(x, t) \equiv \tilde{K}_1(t, x),$

that is,

$S(x)K(x, t) \equiv \tilde{K}(t, x)S(t),$

where the tilde denotes the transposed matrix.


Indeed, in the proof of (4.4) we have used in addition to hypotheses (i), (ii)
only the condition that $\lambda = 0$ is not a characteristic value of (2.1). Relation
(4.4) has been obtained by Bliss [1, p. 580], and it may be readily verified
that his proof also uses only these conditions on the system (2.1).


For convenience in the presentation of the following two lemmas the func- 
tional (4.3) will be denoted by J[g]; similarly, the quantity (4.3’) will be 
written J[h; g]. The following result will be stated without proof, since it fol- 
lows readily from well known properties of Lebesgue integrals. 


LEMMA 4.1. If $J[g] \ge 0$ for all vectors g whose components are continuous
on ab, then this integral, taken in the sense of Lebesgue, is non-negative for all
vectors g whose components are of integrable square on this interval.


Now denote by L’ the extension of L obtained by considering the totality
of vectors y such that: (α’) the components of y are real-valued and absolutely
continuous on ab; (β’) $s[y] = 0$; (γ’) there exists a corresponding vector g
whose components are real-valued, of Lebesgue integrable square on ab, such
that $\mathcal{L}[y] = Bg$ almost everywhere on this interval. In view of Lemma 4.1,
and the fact that we still have $y(x) = \int_a^b K(x, t)g(t)\,dt$ for a vector y of L’, it
follows that the results of Lemmas 3.1, 3.2 and 3.3 remain valid for y, u and v
vectors in L’. Moreover, since $\lambda = 0$ is not a characteristic value for an
H-definitely self-adjoint problem, it follows as in the proof of Theorem 4.2
that $H[y] = 0$ for a vector y of L’ if and only if $By \equiv 0$ on ab. That is, as far
as the results previously established are concerned, in the definition of
H-definitely self-adjoint systems one might without further restriction have used
the vector space L’ instead of L. As a matter of fact, this remark is valid for
all the results obtained in the present paper. Specifically, in this connection, it
is to be pointed out that the results of Bliss used in §7 are valid for the space
L’ instead of L.

Because of the special form of K(x, t), and the fact that if the components
of b(x) are integrable on ab then $y(x) = \int_a^b G(x, t)b(t)\,dt$ is a vector whose
components are absolutely continuous, satisfies $\mathcal{L}[y] = b(x)$ almost everywhere on
ab, and $s[y] = 0$, all the preceding results may be proved for a much more
general linear vector space than L’. In particular, they all hold for the space
of vectors y satisfying the above conditions (α’), (β’), and the condition
obtained by replacing in (γ’) the phrase “of Lebesgue integrable square” by
“Lebesgue integrable.” However, for a number of the subsequent results to
remain valid, it is necessary to restrict the involved vectors to the space L’.

We shall denote by $K_{1ij}(x, x+)$ the limiting values of $K_{1ij}(x, t)$ as t tends
to x through values greater than x, and write $K_1(x, x+) = \|K_{1ij}(x, x+)\|$. The
quantities $K_{1ij}(x, x-)$ and $K_1(x, x-)$ are defined in a corresponding fashion.
Since $K_1(x, t) = S(x)G(x, t)B(t)$ it follows that the elements of $K_1$ have
discontinuities at most along the line x = t. Moreover, if $K_1(x, t)$ is taken to be
equal to $K_1(x, x+)$ along x = t, then the elements of the resulting matrix are
continuous in (x, t) on the region $R_1$: $x \le t \le b$, $a \le x \le b$. Similarly, if $K_1(x, t)$
is taken to be equal to $K_1(x, x-)$ along x = t, then the elements of the resulting
matrix are continuous in (x, t) on $R_2$: $a \le t \le x$, $a \le x \le b$.


LEMMA 4.2. If the functional J[g] defined by (4.3) be positive semi-definite for
arbitrary vectors g whose components are continuous on ab, then
$K_1(x, x-) \equiv K_1(x, x+)$ on $a \le x \le b$; that is, if $K_1(x, t)$ be taken as equal to
this common value along the line x = t, then the elements of $K_1$ are continuous in
(x, t) on $a \le x, t \le b$. Moreover, the matrix $K_1(x, x)$ thus defined is symmetric and
positive semi-definite on ab.


For convenience, we write $J[g] = J_1[g] + J_2[g]$, where $J_1$ and $J_2$ are the
integrals of the integrand of (4.3) taken over the above defined regions $R_1$
and $R_2$, respectively. The integrals $J_1[h; g]$, $J_2[h; g]$ are defined similarly.
Now consider a point $(x_0, x_0)$ with $a < x_0 < b$. For an arbitrary constant
vector $g_0$, denote by $g_k(x)$, ($k = 1, 2, \cdots$), the vector whose components are
identically zero except on $|x - x_0| \le d_k$, where $d_k = c/k$ and c is the smaller of
the numbers $x_0 - a$, $b - x_0$, while on $x_0 - d_k \le x < x_0$ we define $g_k = (1/d_k)g_0$, and
on $x_0 \le x \le x_0 + d_k$ we set $g_k = (-1/d_k)g_0$. Because of the continuity properties
of the elements of $K_1$ as described above, it is readily calculated that
$\lim_{k\to\infty} J_1[g_k] = 0 = \lim_{k\to\infty} J_2[g_k]$, and consequently, $\lim_{k\to\infty} J[g_k] = 0$. Now
for a second arbitrary constant vector $h_0$, define $h_k(x) = 0$ except on
$x_0 - d_k \le x \le x_0 + d_k$, and $h_k = (1/d_k)h_0$ on this subinterval. Clearly there exists
a constant $\kappa$ such that $J[h_k] \le \kappa$, ($k = 1, 2, \cdots$). Moreover, in view of the
positive semi-definite character of J proved in Lemma 4.1, we then have
$\{J[h_k; g_k]\}^2 \le J[h_k]\,J[g_k]$, ($k = 1, 2, \cdots$), and hence $\lim_{k\to\infty} J[h_k; g_k] = 0$.
Again, using the continuity properties of the elements of $K_1$ described
above, it is found that $\lim_{k\to\infty} J_1[h_k; g_k] = -h_0K_1(x_0, x_0+)g_0$, and
$\lim_{k\to\infty} J_2[h_k; g_k] = h_0K_1(x_0, x_0-)g_0$. Thus for arbitrary constant vectors $h_0$,
$g_0$ we have $h_0[K_1(x_0, x_0-) - K_1(x_0, x_0+)]g_0 = 0$, and consequently
$K_1(x, x-) - K_1(x, x+) \equiv 0$ on $a < x < b$, whence it in turn follows that this relation is
also true at the end values a and b. In the following we shall write $K_1(x, x)$ for
this common limiting value along the line x = t. Returning to the above defined
sequence $\{h_k\}$, we see that for $a < x_0 < b$ we have
$4h_0K_1(x_0, x_0)h_0 = \lim_{k\to\infty} J[h_k] \ge 0$ for arbitrary vectors $h_0$. Hence, the matrix
$K_1(x, x)$ is positive semi-definite on $a < x < b$, and by continuity this property
holds on the closed interval ab. The symmetry of this matrix follows from (4.4).
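
The limits quoted in the proof as "readily calculated" can be checked by a short area count. The sketch below is not taken from the paper: it assumes that $h_k = (1/d_k)h_0$ on the whole interval $|x - x_0| \le d_k$ (a reading consistent with the factor 4 in the final inequality), and it writes $Q_k$ for the square $|x - x_0| \le d_k$, $|t - x_0| \le d_k$ and $\sigma(t) = \pm 1$ for the sign of $g_k$, both of which are notations introduced only for this illustration.

```latex
% Assumptions: Q_k = \{|x-x_0|\le d_k,\ |t-x_0|\le d_k\},
% \sigma(t)=+1 for t<x_0 and \sigma(t)=-1 for t\ge x_0 (sign of g_k).
% On Q_k \cap R_1 the kernel tends to K_1(x_0,x_0+); on Q_k \cap R_2 to K_1(x_0,x_0-).
\begin{aligned}
\frac{1}{d_k^2}\iint_{Q_k\cap R_1}\sigma(t)\,dx\,dt
   &= \frac{1}{d_k^2}\Big(\tfrac12 d_k^2-\tfrac32 d_k^2\Big) = -1,
   &&\text{so } J_1[h_k;g_k]\to -h_0K_1(x_0,x_0+)g_0,\\
\frac{1}{d_k^2}\iint_{Q_k\cap R_2}\sigma(t)\,dx\,dt
   &= \frac{1}{d_k^2}\Big(\tfrac32 d_k^2-\tfrac12 d_k^2\Big) = +1,
   &&\text{so } J_2[h_k;g_k]\to h_0K_1(x_0,x_0-)g_0,\\
\frac{1}{d_k^2}\,\operatorname{area}(Q_k\cap R_1)
   &= \frac{1}{d_k^2}\,\operatorname{area}(Q_k\cap R_2) = 2,
   &&\text{so } J[h_k]\to 2h_0\big[K_1(x_0,x_0+)+K_1(x_0,x_0-)\big]h_0 .
\end{aligned}
```

Once the two one-sided limits of $K_1$ are known to agree, the last line reduces to $4h_0K_1(x_0, x_0)h_0$, which is the quantity shown to be non-negative.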

Clearly the above result applies to any positive semi-definite kernel matrix
$K_1(x, t)$ such that $K_1(x, t) \equiv \tilde{K}_1(t, x)$, and which possesses the continuity
properties described immediately preceding the lemma.


THEOREM 4.3. For an H-definitely self-adjoint system (2.1) the matrix B(x)
must satisfy the condition $BB \equiv 0$ on ab; in particular, the rank of B(x) at any
point of this interval cannot be greater than [n/2], the largest integer not exceeding
the value n/2.


Since for an H-definitely self-adjoint system we have
$K_1(x, t) = S(x)G(x, t)B(t)$, and as $G(x, x-) - G(x, x+) = I$ on ab (see, for example,
[1, p. 578]), we have from Lemma 4.2 that
$0 \equiv K_1(x, x-) - K_1(x, x+) = S(x)B(x) = T(x)B(x)B(x)$ on ab. As T is nonsingular,
it then follows that $BB \equiv 0$ on this interval. Algebraically, it is readily seen
that this condition implies that at each point of ab the rank of B cannot
exceed [n/2].
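
The algebraic step can be made concrete: BB = 0 says that the range of B lies in its null space, so rank B ≤ n − rank B. The following is a minimal sketch in Python, not part of the paper; the helper names (`matmul`, `rank`, `square_zero_matrix`, `shift_matrix`) and the test matrices are hypothetical choices made for illustration only. It checks that the bound [n/2] is attained by a matrix with BB = 0, and that a matrix of larger rank (a nilpotent shift) fails BB = 0.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def rank(m):
    """Rank of a matrix, by exact Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in m]
    r = 0  # number of pivots found so far
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


def is_zero(m):
    """True when every entry of the matrix vanishes."""
    return all(x == 0 for row in m for x in row)


def square_zero_matrix(n):
    """Hypothetical example: B sends e_i to e_{r+i} for i < r = n // 2 and
    annihilates every other basis vector, so that B*B = 0 and rank B = r."""
    r = n // 2
    b = [[0] * n for _ in range(n)]
    for i in range(r):
        b[r + i][i] = 1
    return b


def shift_matrix(n):
    """Nilpotent shift (a single Jordan block), of rank n - 1."""
    return [[1 if j == i + 1 else 0 for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    for n in (2, 3, 4, 5):
        B = square_zero_matrix(n)
        # B*B vanishes and the rank equals the bound n // 2.
        print(n, is_zero(matmul(B, B)), rank(B), n // 2)
```

Exact rational arithmetic is used so that the rank is not affected by rounding; the example says nothing about which matrices B(x) actually occur in a given boundary problem.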

This result, which at first notice seems remarkable, first occurred to the 
author in considering the results which will be presented in §8. Indeed, be- 
cause of this relatively strong condition imposed upon B(x) by the H-definite 
self-adjoint character of (2.1), one might conclude that this class of boundary 
value problems is too restrictive to be of great significance. That this is not so, 
however, is borne out by the fact that this class of problems includes those of 
the type discussed in §10. Moreover, the additional results obtained in §7 
concerning systems which are definitely self-adjoint in the sense of Bliss also 
show the significance of such problems. 

5. Properties of H-definitely self-adjoint systems. We shall now proceed 
to establish some fundamental properties of systems (2.1) which are H-defi- 
nitely self-adjoint. 


THEOREM 5.1. All the characteristic values of an H-definitely self-adjoint
system are real, and the corresponding characteristic functions may be chosen real.


Suppose $\lambda = \lambda_1 + (-1)^{1/2}\lambda_2$, ($\lambda_2 \ne 0$), is a characteristic value of (2.1), and
$y = u + (-1)^{1/2}v$ is a corresponding characteristic solution. Then $\bar{y} = u - (-1)^{1/2}v$
is a characteristic solution of this system corresponding to the complex
conjugate value $\bar\lambda$ of the characteristic parameter. As $\lambda \ne \bar\lambda$, it follows from
Lemma 3.4 that $H[\bar{y}; y] = 0$. Since


$\mathcal{L}[u] = B(\lambda_1u - \lambda_2v), \qquad \mathcal{L}[v] = B(\lambda_2u + \lambda_1v), \qquad s[u] = 0 = s[v],$


the vectors u, v belong to L. Consequently, in view of Lemma 3.1,
$H[\bar{y}; y] = H[u] + H[v]$. It then follows from condition (iii)’ and Lemma
3.2 that $Bu \equiv 0$, $Bv \equiv 0$ on ab; that is, u and v are individually solutions of
(2.1) for $\lambda = 0$. It is then a consequence of Theorem 4.1 that $u \equiv 0$, $v \equiv 0$, which
is a contradiction to the assumption that $y = u + (-1)^{1/2}v$ is a characteristic
solution for the value $\lambda$. Hence all the characteristic values of an H-definitely
self-adjoint system are real, and because of the reality of the coefficients of
such a system the corresponding characteristic solutions may be chosen real.

In the future, when we speak of a characteristic solution of an H-definitely 
self-adjoint system, it will be understood that this solution is real. 


THEOREM 5.2. If $\lambda$ is a characteristic value of an H-definitely self-adjoint
system, and y a corresponding characteristic solution, then $H[y] > 0$ and
$\int_a^b ySy\,dx$ has the sign of $\lambda$.


Since, by Theorem 4.1, $\lambda = 0$ is not a characteristic value, for a
characteristic solution y of (2.1) we have $By \not\equiv 0$ on ab, and hence $H[y] > 0$ by (iii)’.
The rest of the theorem is an immediate consequence of (2.5).

Let $Y(x, \lambda)$ be a matrix whose columns are n linearly independent
solutions of the differential equations of (2.1), and whose elements are
permanently convergent power series in $\lambda$. Such a matrix is determined, for example,
by the initial condition $Y(a, \lambda) = I$. By definition, the multiplicity of a
characteristic value of (2.1) is equal to its multiplicity as a zero of the characteristic
determinant $|MY(a, \lambda) + NY(b, \lambda)|$, which is a permanently convergent
power series in $\lambda$. The index of $\lambda$ as a characteristic value of (2.1) is equal
to the number of corresponding linearly independent solutions of this system.


THEOREM 5.3. For an H-definitely self-adjoint system (2.1) the index of a 
characteristic value is equal to its multiplicity. 


The proof is the same as that of Theorem 10 in Bliss [1], down to the
last equation on page 572. On the assumption that the result of the theorem
is not true, this equation states that there exists a characteristic solution y
of (2.1) such that $\int_a^b ySy\,dx = 0$. This, however, is impossible in view of the
above Theorem 5.2.


THEOREM 5.4. If the components of f are continuous on ab, and the condition


(5.1) $\int_a^b fSy\,dx = 0$


is satisfied by every characteristic solution y of an H-definitely self-adjoint
system, then this relation is also satisfied by every vector y of $L^2$.


In view of the preceding theorem, the condition (5.1) for all characteristic
solutions of (2.1) implies, as in the proof of Theorem 11 of Bliss [1], that the
nonhomogeneous system


(5.2) $\mathcal{L}[y] = \lambda By + Bf, \qquad s[y] = 0,$

has a solution $y(x, \lambda)$ of the form

(5.3) $y(x, \lambda) = u_0(x) + \lambda u_1(x) + \lambda^2u_2(x) + \cdots;$


the components of $u_\mu(x)$, ($\mu = 0, 1, \cdots$), are of class C’ on ab, and this series
converges absolutely and uniformly on any region of the form $a \le x \le b$,
$|\lambda| \le \rho$. Moreover, if we write $u_{-1}(x) = f(x)$ and $v_\mu(x) = T(x)u_\mu(x)$,
($\mu = -1, 0, 1, \cdots$), then

(5.4) $\mathcal{L}[u_\mu] = Bu_{\mu-1}, \qquad s[u_\mu] = 0, \qquad \mu = 0, 1, \cdots,$

(5.5) =  #t{v,] = 0, 


In view of the boundary conditions we also have 


f dx -f UpStUy,—1 


b b b 
f dx = f Bu, dx = f dx, 


As 


b 6 
f de = f Buys dz = tir T dx, 


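we have

$H[u_\mu; u_{\nu+1}] = H[u_{\mu+1}; u_\nu], \qquad \mu, \nu = 0, 1, 2, \cdots.$


By Lemma 3.1 it also follows that $H[u_\mu; u_\nu] = H[u_\nu; u_\mu]$. Now set

$W_\mu = H[u_0; u_\mu], \qquad \mu = 0, 1, \cdots.$

By the above relations we have

$W_{\mu+\nu} = H[u_\mu; u_\nu], \qquad \mu, \nu = 0, 1, \cdots,$

and it results from Lemma 3.3 that

(5.6) $[W_{2\mu}]^2 = \{H[u_{\mu-1}; u_{\mu+1}]\}^2 \le W_{2\mu-2}W_{2\mu+2}, \qquad \mu = 1, 2, \cdots.$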


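Writing the differential equations of (5.2) in integral form, and employing
the uniform convergence of (5.3) in a region of the form $a \le x \le b$, $|\lambda| \le \rho$, it
follows that $\mathcal{L}[y]$ is a permanently convergent power series in $\lambda$ given by


(5.7) $\begin{aligned} \mathcal{L}[y] &= \mathcal{L}[u_0] + \lambda\mathcal{L}[u_1] + \lambda^2\mathcal{L}[u_2] + \cdots \\ &= B(f + \lambda u_0 + \lambda^2u_1 + \cdots). \end{aligned}$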


From its specific form, it is seen that the series (5.7) has convergence
properties of the sort indicated above for the series (5.3). Consequently, the series


(5.8) $W_0 + W_1\lambda + W_2\lambda^2 + \cdots, \qquad W_0 + W_2\lambda^2 + W_4\lambda^4 + \cdots,$


the first of which is equal to $H[u_0; y]$, are permanently convergent power
series in $\lambda$.

If $W_2 \ne 0$ it follows from (5.6) that $W_{2\mu} \ne 0$, ($\mu = 1, 2, \cdots$), and the second
series of (5.8) is seen to diverge for $\lambda = (W_2/W_4)^{1/2}$. Hence the permanent
convergence of this series is possible only if $0 = W_2 = H[u_1]$. Condition (iii)’ then
implies that $Bu_1 \equiv 0$ on ab. Moreover, by Lemma 3.3 we have $H[y; u_1] = 0$ for
arbitrary vectors y of L. As $\mathcal{L}[u_1] = Bu_0$, we may also state this condition as


(5.9) $0 = \int_a^b u_0Sy\,dx = \int_a^b ySu_0\,dx$


for arbitrary vectors y of L. In particular, for $y = u_0$ we have


(5.10) $\int_a^b u_0Su_0\,dx = 0.$
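
The divergence claim used above can be spelled out as follows. This is a sketch under two assumptions that are not stated explicitly at this point of the text: that the second series of (5.8) is $\sum_\mu W_{2\mu}\lambda^{2\mu}$, and that $W_{2\mu} = H[u_\mu] \ge 0$ (non-negativity of H on L). The symbol $a_\mu$ for the general term is introduced only for this illustration.

```latex
% Assumptions: W_{2\mu} = H[u_\mu] \ge 0, W_2 \ne 0, and the second series of (5.8)
% is \sum_{\mu\ge 0} W_{2\mu}\lambda^{2\mu}.
\begin{aligned}
&W_{2\mu}^2 \le W_{2\mu-2}\,W_{2\mu+2}\ (\mu \ge 1)
  \;\Longrightarrow\; W_{2\mu} > 0 \ (\mu \ge 1), \qquad
  \frac{W_{2\mu+2}}{W_{2\mu}} \ \ge\ \frac{W_{2\mu}}{W_{2\mu-2}} \ \ge\ \cdots \ \ge\ \frac{W_4}{W_2}
  \quad (\mu \ge 2),\\[2pt]
&a_\mu := W_{2\mu}\lambda^{2\mu}\ \text{ with } \lambda^2 = \frac{W_2}{W_4}:
  \qquad \frac{a_{\mu+1}}{a_\mu} = \frac{W_{2\mu+2}}{W_{2\mu}}\cdot\frac{W_2}{W_4} \ \ge\ 1
  \quad (\mu \ge 1),\\[2pt]
&\text{hence } a_\mu \ \ge\ a_1 = \frac{W_2^2}{W_4} \ >\ 0 \ \text{ for all } \mu \ge 1,
  \ \text{ and } \sum_\mu a_\mu \text{ cannot converge.}
\end{aligned}
```

The argument never uses $W_0$, which is why it yields $W_2 = 0$ but gives no information about the constant term.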


Now suppose y is any vector belonging to the space $L^2$, and g(x) is a vector
of L such that $\mathcal{L}[y] = Bg$. By Lemma 3.1 we then have


$0 = H[y; u_0] - H[u_0; y] = \int_a^b fSy\,dx - \int_a^b u_0Sg\,dx = \int_a^b fSy\,dx,$


in view of (5.9) and the fact that g is a vector of L. This completes the proof
of Theorem 5.4.

The above result for H-definitely self-adjoint systems is somewhat weaker
than the corresponding result for definitely self-adjoint systems (see Bliss [2,
Theorem 2.3]). Formally, this is true because the permanent convergence of
the second series of (5.8) does not imply that the constant term $W_0$ of this
series is equal to zero; the failure to obtain this latter result is in turn a
consequence of the fact that we do not have an inequality of the form (5.6) for
$\mu = 0$. If the convergence of the second series of (5.8) were to imply the
vanishing of $W_0$, by the argument used above we could proceed to show that the
hypotheses of the above theorem imply the relation (5.1) for arbitrary vectors
of the space L instead of merely for the vectors belonging to $L^2$. That the
result of the above theorem cannot in general be thus strengthened, however,
is shown by the following example.


Consider the system 
(5.11) 
$y_1(0) - y_2(0) = 0, \qquad y_1(1) + y_2(1) = 0,$


where b(x) is a continuous function not identically zero on $0 \le x \le 1$, and such
that


(5.12) $\int_0^1 b(x)\,dx = 0.$


It may readily be verified that this system is H-definitely self-adjoint with
the matrix


(5.13) $T = \begin{Vmatrix} 0 & 1 \\ -1 & 0 \end{Vmatrix},$


and, moreover, this system has no characteristic values. For this system,
therefore, the condition of Theorem 5.4 that (5.1) hold for every
characteristic solution imposes no additional restriction on a vector $f = (f_1(x), f_2(x))$ with
continuous components. Now


$y_1(x) = 1, \qquad y_2(x) = 1 - \Big(2 \Big/ \int_0^1 b^2\,dt\Big)\int_0^x b^2\,dt$


is a vector of the space L for this problem, and for this particular y we have


$\int_0^1 fSy\,dx = \int_0^1 f_1(x)b(x)y_1(x)\,dx = \int_0^1 f_1(x)b(x)\,dx.$


For certain continuous functions $f_1(x)$, in particular, for $f_1(x) = b(x)$, this
expression is different from zero. Thus we see that in the statement of Theorem
5.4 the phrase “every vector y of $L^2$” cannot in general be replaced by “every
vector y of L.”

We shall now proceed to establish as corollaries to the above theorem cer- 
tain related results. 


COROLLARY 1. If the system (2.1) is H-definitely self-adjoint and f(x) is a
vector of the corresponding space L for which condition (5.1) is satisfied by every
characteristic solution y of the system, then this condition is also satisfied by every
vector y of L; in particular, $\int_a^b fSf\,dx = 0$.


Let g(x) be a vector with continuous components such that $\mathcal{L}[f] = Bg$,
$s[f] = 0$. For a characteristic solution y corresponding to a characteristic
value $\lambda$ we have


$\int_a^b ySg\,dx = \lambda\int_a^b ySf\,dx,$

and thus the condition that f satisfies (5.1) with every characteristic solution y
implies that the vector g satisfies a like condition. It then follows from
Theorem 5.4 that $\int_a^b y^*Sg\,dx = 0$ for arbitrary vectors $y^*$ of $L^2$. Now for an arbitrary
vector y of L let $y^*$ denote the vector of $L^2$ such that $\mathcal{L}[y^*] = By$, $s[y^*] = 0$.
Then


$0 = \int_a^b y^*Sg\,dx = H[y^*; f] = H[f; y^*] = \int_a^b fSy\,dx,$

so that the conditions of the corollary imply (5.1) for arbitrary vectors y
of L. Since f(x) belongs to L, we have, in particular, $\int_a^b fSf\,dx = 0$.


COROLLARY 2. If the system (2.1) is H-definitely self-adjoint and f(x) is a
vector of the corresponding space $L^2$ for which condition (5.1) is satisfied by every
characteristic solution y of the system, then $B(x)f(x) \equiv 0$ on the interval ab.


Let g(x) be a vector of L such that $\mathcal{L}[f] = Bg$, $s[f] = 0$. Then by an
argument similar to that used in the proof of Corollary 1 we have $\int_a^b ySg\,dx = 0$
for every characteristic solution y and hence, by Theorem 5.4, this condition
also holds for arbitrary vectors y of $L^2$. In particular, for y = f we have


$0 = \int_a^b fSg\,dx = H[f],$


and in view of (iii)’ we have $Bf \equiv 0$ on ab.


COROLLARY 3. If for an H-definitely self-adjoint system the condition
$B(x)y(x) \equiv 0$ on ab holds for a vector y of L if and only if $y(x) \equiv 0$ on this interval,
then if the components of f(x) are continuous and condition (5.1) is satisfied by
every characteristic solution of the system it follows that $B(x)f(x) \equiv 0$ on the
interval ab.


In the proof of Theorem 5.4 it was established that the vector $u_1$ of L
defined by (5.3) satisfies $Bu_1 \equiv 0$ on ab. Under the strengthened hypotheses
of the corollary we consequently have $u_1 \equiv 0$, and it then follows from (5.4)
for $\mu = 1$ that $Bu_0 \equiv 0$ on ab. As $u_0$ is also a vector of L it in turn results that
$u_0 \equiv 0$, and hence $B(x)f(x) \equiv 0$ on ab by equation (5.4) for $\mu = 0$.

For an H-definitely self-adjoint system the additional hypothesis of the
above corollary is clearly equivalent to the following: $H[y] > 0$ for every
non-identically vanishing vector y of L.

6. Existence of characteristic values. In general an H-definitely self-ad- 
joint system (2.1) does not possess an infinity of characteristic values. In 
particular, the example (5.11) of the preceding section illustrates the possibil- 
ity that such a system may have no characteristic values. It is also easy to 
construct examples of such systems that have only a finite number of charac- 
teristic values. We shall, therefore, consider in this section the possible char- 
acter of the totality of characteristic values of an H-definitely self-adjoint 
system. 

Since for such a system the characteristic values are the zeros of a
permanently convergent power series, and the index of each characteristic value is
equal to its multiplicity, it follows that there can exist at most a denumerable
infinity of characteristic values. Let $\{y_\mu, \lambda_\mu\}$, ($\mu = 1, 2, \cdots$), denote a
maximal set of linearly independent characteristic solutions and corresponding
characteristic values, the former chosen orthonormal in the sense that


(6.1) $\frac{|\lambda_\mu|}{\lambda_\mu}\int_a^b y_\mu Sy_\nu\,dx = \delta_{\mu\nu}, \qquad \mu, \nu = 1, 2, \cdots,$


where $\delta_{\mu\nu} = 0$ if $\mu \ne \nu$, $\delta_{\mu\mu} = 1$. Such a choice is possible in view of Theorem 5.2.


THEOREM 6.1. A necessary and sufficient condition that an H-definitely
self-adjoint system have at least k linearly independent characteristic solutions is that
the quadratic functional H[y] be positive definite on a linear subspace of $L^2$ of
dimension k; that is, that there exist vectors $f_\mu(x)$, ($\mu = 1, \cdots, k$), of $L^2$ such that
for arbitrary constants $(d_1, \cdots, d_k) \ne (0, \cdots, 0)$ the vector
$f(x) = f_1(x)d_1 + \cdots + f_k(x)d_k$ renders $H[f] > 0$.


For suppose that $y_\mu(x)$, ($\mu = 1, \cdots, k$), are linearly independent
characteristic solutions of such a system, and that these solutions are chosen
orthonormal in the sense of (6.1). If $\lambda_\mu$ denote the characteristic value
corresponding to $y_\mu$, then for $f_\mu = y_\mu$ and arbitrary constants
$(d_1, \cdots, d_k) \ne (0, \cdots, 0)$ we have for each vector $f = f_1d_1 + \cdots + f_kd_k$ that
$H[f] = |\lambda_1|d_1^2 + \cdots + |\lambda_k|d_k^2 > 0$.
Hence the condition of the theorem is necessary.

In order to prove the sufficiency of the theorem, suppose that there exists
a linear subspace of $L^2$ determined by vectors $f_1, \cdots, f_k$ on which H[y] is
positive definite, while the system (2.1) has fewer than k linearly independent
characteristic solutions. It would then follow that there exists a set of
constants $d_1, \cdots, d_k$ not all zero and such that the vector $f = f_1d_1 + \cdots + f_kd_k$
satisfies equation (5.1) with every characteristic solution y. Since f belongs to
$L^2$, it is then a consequence of Corollary 2 to Theorem 5.4 that $Bf \equiv 0$, and
hence $H[f] = 0$ also, contrary to the assumption of the positive definite
character of H[y] on the linear subspace of $L^2$ determined by $f_1, \cdots, f_k$. Hence
the condition is also sufficient.

We shall now give a particular sufficient condition for an H-definitely 
self-adjoint system to have an infinity of characteristic values. This condition 
has application for the special boundary value problem of §10. Suppose that 
the matrices A(x) and B(x) satisfy the following condition. 

(v) There is a subinterval $a_1b_1$, $a < a_1 < b_1 < b$, of ab such that if $a_1'$, $b_1'$ are
arbitrary values satisfying $a_1 < a_1' < b_1' < b_1$ then there exists a vector g of L
and associated y of $L^2$ satisfying $\mathcal{L}[y] = Bg$, $By \not\equiv 0$ on $a_1'b_1'$, whereas $y \equiv 0$
outside the given interval $a_1'b_1'$.


THEOREM 6.2. If an H-definitely self-adjoint system satisfies condition (v), 
then this system has infinitely many characteristic values. 


For consider an interval $a_1b_1$ on which the condition (v) is satisfied,
and for a given integer k divide $a_1b_1$ into k non-overlapping subintervals
$\Delta_1, \cdots, \Delta_k$. Let $y = f_\mu$ denote a vector of $L^2$ satisfying the conditions of (v)
relative to $\Delta_\mu$, ($\mu = 1, \cdots, k$). Since $Bf_\mu \not\equiv 0$ on $\Delta_\mu$, and $f_\mu \equiv 0$ outside this
interval, it follows readily that $H[f_\mu] > 0$, $H[f_\mu; f_\nu] = 0$ for $\mu \ne \nu$, ($\mu, \nu = 1, \cdots, k$).
Consequently, for each $f = f_1d_1 + \cdots + f_kd_k$ we have
$H[f] = H[f_1]d_1^2 + \cdots + H[f_k]d_k^2$, and by Theorem 6.1 the system (2.1) has at least k
linearly independent characteristic solutions. Since k may be chosen arbitrarily, such a
system has infinitely many characteristic values.

Corresponding to a vector f we shall denote by $c_\mu[f]$ the Fourier
coefficients


$c_\mu[f] = \frac{|\lambda_\mu|}{\lambda_\mu}\int_a^b fSy_\mu\,dx, \qquad \mu = 1, 2, \cdots.$


Clearly these coefficients are well-defined for a vector f whose components
are merely integrable on ab.


LEMMA 6.1. If $\{y_\mu, \lambda_\mu\}$, ($\mu = 1, 2, \cdots$), denote a maximal set of linearly
independent characteristic solutions and corresponding characteristic values for an
H-definitely self-adjoint system (2.1), the former orthonormal in the sense of
(6.1), then for an arbitrary vector f of L,


$\sum_\mu |\lambda_\mu|\,c_\mu^2[f] \le H[f].$


If f belongs to L, then for arbitrary integers k the vector $f - \sum_{\mu \le k} y_\mu(x)c_\mu[f]$
is also in L, and


$0 \le H\Big[f - \sum_{\mu \le k} y_\mu c_\mu[f]\Big] = H[f] - \sum_{\mu \le k} |\lambda_\mu|\,c_\mu^2[f].$


7. Definitely self-adjoint systems. In this section we shall consider
systems (2.1) that are definitely self-adjoint in the sense of Bliss. A maximal set
of linearly independent characteristic solutions and associated characteristic
values for such a system will again be denoted by $\{y_\mu(x), \lambda_\mu\}$, ($\mu = 1, 2, \cdots$);
moreover, we shall assume that the former are chosen orthonormal in the
sense that


(7.1) $\int_a^b y_\mu Sy_\nu\,dx = \delta_{\mu\nu}, \qquad \mu, \nu = 1, 2, \cdots.$

We also write

(7.2) $c_\mu[f] = \int_a^b fSy_\mu\,dx, \qquad \mu = 1, 2, \cdots,$


for the Fourier coefficients of a vector f(x). It then follows from Theorem 3.1
of Bliss [2] that for an arbitrary vector f of L the series


(7.3) $\varphi(x) = \sum_\mu y_\mu(x)c_\mu[f]$


converges absolutely and uniformly on ab; moreover, $B(x)[f(x) - \varphi(x)] \equiv 0$ on
this interval.


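THEOREM 7.1. If (2.1) is definitely self-adjoint, then for arbitrary vectors f
of L we have


(7.4) $H[f] = \sum_\mu \lambda_\mu c_\mu^2[f], \qquad \int_a^b fSf\,dx = \sum_\mu c_\mu^2[f].$


The uniform convergence of the series (7.3) permits the evaluation of
H[f] as


$H[f] = \int_a^b gSf\,dx = \int_a^b gS\varphi\,dx = \sum_\mu c_\mu[g]\,c_\mu[f] = \sum_\mu \lambda_\mu c_\mu^2[f],$


where g is a vector such that $\mathcal{L}[f] = Bg$; the last relation above is a
consequence of the readily established equality $c_\mu[g] = \lambda_\mu c_\mu[f]$. Similarly,


$\int_a^b fSf\,dx = \int_a^b fS\varphi\,dx = \sum_\mu c_\mu^2[f].$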


THEOREM 7.2. Suppose that (2.1) is definitely self-adjoint and that the
characteristic values of this system are bounded below; moreover, let the set $\{y_\mu(x), \lambda_\mu\}$
be so ordered that $\lambda_1 \le \lambda_2 \le \cdots$. If $C_1$ denote the totality of vectors f of L satisfying
$\int_a^b fSf\,dx = 1$ and $C_1$ is nonvacuous, then $\lambda_1$ is the minimum of H[f] in this class;
moreover, this minimum is attained by a particular f of $C_1$ if and only if
$f = Y_1(x) + \Phi_1(x)$, where $Y_1$ is a characteristic solution for $\lambda = \lambda_1$ and $\Phi_1$ is a vector
of L such that $B\Phi_1 \equiv 0$. In general, if $\lambda_1, \cdots, \lambda_{m-1}$ exist, denote by $C_m$ the totality
of vectors f of L satisfying


$\int_a^b fSf\,dx = 1, \qquad c_\mu[f] = \int_a^b fSy_\mu\,dx = 0, \qquad \mu = 1, \cdots, m - 1.$


If this class is nonvacuous, then $\lambda_m$ exists and is the minimum of H[f] in $C_m$;
moreover, this minimum is attained by a particular f of $C_m$ if and only if
$f = Y_m(x) + \Phi_m(x)$, where $Y_m$ is a characteristic solution for $\lambda = \lambda_m$ and $\Phi_m$ is a
vector of L such that $B\Phi_m \equiv 0$ on ab.


The relations (7.4) clearly imply $H[f] \ge \lambda_1$ in $C_1$ whenever this class is
nonvacuous; furthermore, if $\lambda_1 = \cdots = \lambda_q < \lambda_{q+1}$, then the equality sign holds
if and only if $c_\mu[f] = 0$, ($\mu = q + 1, \cdots$). If $Y_1(x) = \sum_{\mu \le q} y_\mu(x)c_\mu[f]$,
then $\Phi_1(x) = f(x) - Y_1(x)$ belongs to L and $c_\mu[\Phi_1] = 0$, ($\mu = 1, 2, \cdots$). Hence by
the second equation of (7.4), and the definiteness of S we have $B\Phi_1 \equiv 0$ on ab
(see also Bliss [2, Corollary 2.2]). In general, if $\lambda_1, \cdots, \lambda_{m-1}$ exist and the
class $C_m$ is nonvacuous, it again follows from (7.4) that $\lambda_m$ must exist and
$H[f] \ge \lambda_m$ in this class. Moreover, if $\lambda_m = \cdots = \lambda_{m+p} < \lambda_{m+p+1}$, then the
equality sign holds if and only if $c_\mu[f] = 0$ for $\mu > m + p$. If we now define
$Y_m = \sum_{\mu=m}^{m+p} y_\mu c_\mu[f]$, then $\Phi_m = f - Y_m$ is a vector of L
such that $c_\mu[\Phi_m] = 0$, ($\mu = 1, 2, \cdots$). Then, as above, it follows that $B\Phi_m \equiv 0$
on ab.


THEOREM 7.3. If (2.1) is definitely self-adjoint and its characteristic values
are either bounded below or above, then without loss of generality this system may
be taken to be H-definitely self-adjoint; moreover, in this case $BB \equiv 0$ on ab, and
the rank of B(x) does not exceed [n/2] at any point of this interval.


Suppose that the characteristic values are bounded below, and let $\lambda_0$ be a
number less than the smallest characteristic value, $\lambda_1$. Then for an arbitrary
vector f of L the functional


$H[f:\lambda_0] = H[f] - \lambda_0\int_a^b fSf\,dx$


may be written, in view of (7.4), as


$H[f:\lambda_0] = \sum_\mu (\lambda_\mu - \lambda_0)\,c_\mu^2[f].$


Consequently, for f belonging to L we have $H[f:\lambda_0] \ge 0$, and the equality sign
holds if and only if $c_\mu[f] = 0$, ($\mu = 1, 2, \cdots$). For a definitely self-adjoint
system, however, this condition implies $Bf \equiv 0$ on ab in view of the second
equation of (7.4) (see also Bliss [2, Corollary 2.2]). The replacement of H[f] by
$H[f:\lambda_0]$ is equivalent to a linear change of parameter in the boundary value
problem (2.1). Hence for a definitely self-adjoint problem whose
characteristic values are bounded below we may without loss of generality assume that
the functional H[y] satisfies the definiteness property (iii)’; that is, that the
system is H-definitely self-adjoint. By Theorem 4.3 it then follows that
$BB \equiv 0$ and the rank of this matrix does not exceed [n/2] on ab.

In case the characteristic values of (2.1) are bounded above then the
replacement of $\lambda$ by $-\lambda$, or the equivalent replacement of B(x) by $-B(x)$,
transforms the given system into one whose characteristic values are bounded
below. The original system being definitely self-adjoint with T implies that
the new system is definitely self-adjoint with $-T$. Hence, by a linear change
of parameter and the replacement of T by $-T$ the given system is reducible
to one which is H-definitely self-adjoint and the results of the theorem follow
from the preceding case.
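
The phrase "linear change of parameter" in the above proof can be unpacked in one line. The sketch below assumes that $H[f:\lambda_0]$ stands for $H[f] - \lambda_0\int_a^b fSf\,dx$, and it introduces the symbols $\mathcal{L}_0$ and $\lambda'$ purely for illustration; neither appears in the paper.

```latex
% Assumption: H[f:\lambda_0] = H[f] - \lambda_0 \int_a^b fSf\,dx.
% \mathcal{L}_0 and \lambda' are notations introduced only for this sketch.
\begin{aligned}
&\mathcal{L}_0[y] := \mathcal{L}[y] - \lambda_0By, \qquad \lambda' := \lambda - \lambda_0:
  \qquad \mathcal{L}[y] = \lambda By \iff \mathcal{L}_0[y] = \lambda'By,\\
&\mathcal{L}[f] = Bg \;\Longrightarrow\; \mathcal{L}_0[f] = B(g - \lambda_0f), \qquad
  \int_a^b (g - \lambda_0f)Sf\,dx = H[f] - \lambda_0\int_a^b fSf\,dx = H[f:\lambda_0].
\end{aligned}
```

Thus the shifted problem has the same matrix B, the same space L, characteristic values $\lambda_\mu - \lambda_0 > 0$, and $H[f:\lambda_0]$ as its H-functional.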

In this connection, it is worthwhile to point out that certain specific
representations of “equivalent” boundary value problems may have individual
advantages. For example, consider the boundary value problem $y'' + \lambda y = 0$,
$y(0) = 0 = y(\pi)$. A maximal set of linearly independent characteristic functions
and associated characteristic values is $\{\sin nx, n^2\}$, ($n = 1, 2, \cdots$). If we
write this problem as $y_1' = y_2$, $y_2' = -\lambda y_1$, $y_1(0) = 0 = y_1(\pi)$, then this system is
definitely self-adjoint and also H-definitely self-adjoint with the matrix
(5.13). On the other hand, $y_1' = \rho y_2$, $y_2' = -\rho y_1$, $y_1(0) = 0 = y_1(\pi)$, is
“equivalent” to the given problem by setting $\lambda = \rho^2$. This latter system is definitely
self-adjoint with (5.13), but is clearly not H-definitely self-adjoint with this
or any other matrix T since the corresponding matrix B(x) is nonsingular.
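
The contrast between the two formulations can be read off from the coefficient matrices of the parameter. The following minimal Python sketch assumes both systems are written in the form y' = A y + (parameter) B y, so that B has second row (−1, 0) in the first formulation and rows (0, 1), (−1, 0) in the second; the names `B_LAMBDA`, `B_RHO`, `matmul` and `det2` are hypothetical, chosen only for this illustration.

```python
def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


# Assumed coefficient matrix of lambda in  y1' = y2,  y2' = -lambda * y1.
B_LAMBDA = [[0, 0],
            [-1, 0]]

# Assumed coefficient matrix of rho in  y1' = rho * y2,  y2' = -rho * y1.
B_RHO = [[0, 1],
         [-1, 0]]

if __name__ == "__main__":
    # First form: B*B vanishes, as Theorem 4.3 requires of an
    # H-definitely self-adjoint system.
    print(matmul(B_LAMBDA, B_LAMBDA))
    # Second form: B is nonsingular and B*B is minus the identity,
    # so the necessary condition B*B = 0 fails.
    print(det2(B_RHO), matmul(B_RHO, B_RHO))
```

The check only tests the necessary condition BB = 0 of Theorem 4.3; it does not by itself establish that the first formulation is H-definitely self-adjoint.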

The following result is an immediate consequence of Theorem 7.3. 


THEOREM 7.4. If (2.1) is definitely self-adjoint and $B(x)B(x) \not\equiv 0$ on ab, then
this system has infinitely many negative, and also infinitely many positive,
characteristic values.


This theorem contains as a very special case the result of Theorem 4.3 
of Reid [10]. To see this, suppose that B(x) has constant rank $n - m$ on ab and
denote by 7:=7ia(x), (a=1, +--+, m), linearly independent solutions of the 
equations B(x)r=0. From (2.4) it follows that T-'B=BT-, and hence the 
rank of T7(x)B jx(x)|| is the same as the rank of || +ia(x) Ba(x)T7}(x)||, 


which in turn is equal to the rank of || ar sa (2x) B;,(x)||. Now the rank of this 
latter matrix is clearly equal to m if $BB \equiv 0$, and hence the hypotheses of
Theorem 4.3 of [10], which demand that the rank of this matrix exceed m at
some point $x_0$ of ab, require that $BB \not\equiv 0$ on ab. It is also to be noted that in
the above referred to theorem of [10] it was not proved that the system had 
infinitely many characteristic values of each algebraic sign, but simply that 
the system had infinitely many characteristic values under the stronger hy- 
potheses there stated. 
In view of relations (7.4) we have the following result. 


THEOREM 7.5. A definitely self-adjoint system (2.1) has at least k
characteristic values if and only if $\int_a^b fSf\,dx$ is positive definite on a linear subspace of L of
dimension k. Moreover, for a given constant $\lambda_0$ such a system has at least k
characteristic values greater [less] than $\lambda_0$ if and only if the functional $H[f:\lambda_0]$ is
positive [negative] definite on a linear subspace of L of dimension k.


In the case of a definitely self-adjoint system for which the matrix B(x) 
has constant rank on ab the first part of this theorem was deduced by Reid 
[10, Theorem 4.1] from known results for an auxiliary problem associated 
with the calculus of variations. Analogues of the above theorem for H-defi- 
nitely self-adjoint systems are contained in Theorem 6.1 and Theorem 9.3. 

8. A special definitely self-adjoint system. Suppose now that the bound- 
ary value problem (2.1) satisfies conditions (i), (ii) and (iv) of §2 with a 
matrix T(x). In this section we shall consider the associated system 


(8.1) $\mathcal{L}[y] = \lambda B_1(x)y, \qquad s[y] = 0,$


where $B_1(x) = B(x)T(x)B(x) = B(x)S(x)$. This problem is seen to be definitely
self-adjoint with the same matrix T(x). In the first place, in order to show that
(8.1) is self-adjoint with T it remains only to show that $TB_1 + B_1T \equiv 0$, and
this is true since $TBTB + BTBT = (TB + BT)S \equiv 0$ by (2.3). If we set
$S_1(x) = T(x)B_1(x) = S(x)S(x)$, clearly conditions (ii) and (iii) are satisfied by
$S_1$. Finally, since $B_1y = 0$ implies $yTB_1y = ySSy = 0$, and hence $Sy = 0$ and
$By = 0$, condition (iv) for (2.1) implies the corresponding condition for (8.1).

Since a definitely self-adjoint problem has at most a denumerable infinity 
of characteristic values, for the consideration of (8.1) one may assume without 
loss of generality that $\lambda = 0$ is not a characteristic value of this system. If
this condition is not true for the problem as written, it is attainable by a linear 
change of parameter. We shall make this assumption in the following discussion. 

If y is a characteristic solution of (8.1) corresponding to a value $\lambda$, set
$u(x) = S(x)y(x)$. In view of condition (iv) for (8.1) we have $u \not\equiv 0$ on ab. Then
$\mathcal{L}[y]=\lambda Bu$, $s[y]=0$, and if G(x, t) denotes the Green's matrix for the
incompatible system $\mathcal{L}[y]=0$, $s[y]=0$, we have


$y(x) = \lambda\int_a^b K(x, t)u(t)\,dt,$


1942] SELF -ADJOINT BOUNDARY PROBLEMS 401 


where, as in §4, $K(x, t) = G(x, t)B(t)$. In particular, it then follows that u(x)
is a characteristic solution, for this same value of $\lambda$, of the linear vector
integral equation


(8.2) $u(x) = \lambda\int_a^b K_1(x, t)u(t)\,dt,$


where, again as in §4, we have written $K_1(x, t) = S(x)K(x, t)$. It also follows
from the comment after equation (4.4) that $K_1(x, t)=\tilde K_1(t, x)$, and hence
(8.2) is of the type covered by the Hilbert-Schmidt theory of linear integral
equations. Conversely, if u is a characteristic solution of (8.2) for a value $\lambda$,
and y is defined as the corresponding unique solution of $\mathcal{L}[y]=\lambda Bu$, $s[y]=0$,
it follows that $u(x) = S(x)y(x)$ and y is a characteristic solution of (8.1) for
the same value of $\lambda$. Hence, there is complete equivalence between the
boundary value problem (8.1) and the integral equation (8.2).
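
The passage from (8.1) to (8.2) amounts to one multiplication by S. A sketch, writing the limits a, b explicitly and using only the definitions $u=Sy$, $K=GB$, $K_1=SK$ quoted above, together with the usual representation of the solution of $\mathcal{L}[y]=h$, $s[y]=0$ by the Green's matrix:

```latex
\begin{align*}
\mathcal{L}[y] = \lambda B_1 y = \lambda B (S y) = \lambda B u
  \;&\Longrightarrow\; y(x) = \lambda \int_a^b G(x,t)B(t)u(t)\,dt
      = \lambda \int_a^b K(x,t)u(t)\,dt, \\
u(x) = S(x)y(x)
  \;&\Longrightarrow\; u(x) = \lambda \int_a^b S(x)K(x,t)u(t)\,dt
      = \lambda \int_a^b K_1(x,t)u(t)\,dt .
\end{align*}
```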

We shall denote by $\{y_\sigma, \lambda_\sigma\}$, $(\sigma=1, 2, \dots)$, a maximal set of linearly
independent characteristic solutions and corresponding characteristic values of
(8.1), the former chosen orthonormal in the sense that


(8.3) $\int_a^b \tilde y_\sigma S_1 y_\tau\,dx = \delta_{\sigma\tau}, \quad \sigma,\tau=1,2,\dots.$


Correspondingly, $\{u_\sigma=Sy_\sigma, \lambda_\sigma\}$ is a maximal set of linearly independent
characteristic solutions and corresponding characteristic values of (8.2) satisfying
$\int_a^b \tilde u_\sigma u_\tau\,dx = \delta_{\sigma\tau}$.

Finally, if g(x) is a vector whose components are continuous (or of Lebesgue
integrable square) on ab, and if f is defined by the system $\mathcal{L}[f]=Bg$,
$s[f]=0$, it follows from (4.3) that


$H[f] = \int_a^b\int_a^b \tilde g(x)K_1(x, t)g(t)\,dx\,dt.$


It is to be emphasized that the above defined vector f belongs to the linear
vector space L for the problem (2.1), but not necessarily to the corresponding
space $L_1$ for the problem (8.1), since this latter space contains vectors f which
satisfy with associated vectors g the differential system


(8.4) $\mathcal{L}[f]=BSg, \quad s[f]=0.$


In case the matrix B is nonsingular on ab the space $L_1$ is seen to be identical
with L. However, since in general $L_1$ is a subspace of L and $B_1y=0$ on ab if
and only if $By=0$ on this interval, the condition that (2.1) be H-definitely
self-adjoint clearly implies that (8.1) is also H-definitely self-adjoint.

The results of Bliss [2], and those of the preceding section, give properties
of the particular definitely self-adjoint system (8.1) on the space $L_1$. We wish,
however, to go further and obtain properties of this system on the space L
corresponding to the boundary value problem (2.1). As pointed out above,


402 W.T. REID [November 


one may always by a linear change of parameter, replacing $\lambda$ by a suitable
$\lambda+\lambda_0$, insure for (8.1) that $\lambda=0$ is not a characteristic value of this
system. Now this change of parameter is equivalent to replacing A(x) by


$A(x) + \lambda_0 B(x)S(x).$


Before proceeding further, it is to be emphasized that the space L, as
defined in §2, is invariant under this operation. This results from the
fact that for a given vector y whose components are of class $C'$ the
vector differential equation $y'-Ay=Bg$ is equivalent to the equation
$y'-Ay-\lambda_0BSy = Bg_1$, by the transformation $g_1=g-\lambda_0Sy$.
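
The invariance can be verified directly. In the sketch below, $g_1$ is the name assumed for the transformed vector; the only fact used is that the right-hand member remains of the form B times a vector.

```latex
\begin{align*}
y' - Ay = Bg
 \;&\Longleftrightarrow\; y' - Ay - \lambda_0 BSy = Bg - \lambda_0 BSy
      = B\,(g - \lambda_0 Sy) = B g_1, \\
g_1 &= g - \lambda_0 S y, \qquad g = g_1 + \lambda_0 S y .
\end{align*}
```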

Corresponding to a given vector f, we set


$d_\sigma[f]=\int_a^b \tilde y_\sigma S_1 f\,dx, \quad \delta_\sigma[f]=\int_a^b \tilde y_\sigma S f\,dx=\int_a^b \tilde u_\sigma f\,dx, \quad \sigma=1,2,\dots;$


clearly these coefficients are well-defined if the components of f are integrable
on ab. Since the vectors $u_\sigma$ are orthonormal, the following result is an immediate
consequence of Bessel's inequality.


LEMMA 8.1. If the components of g(x) are of integrable square on ab, then
the series $\sum_\sigma \delta_\sigma^2[g]$ converges and


(8.5) $\sum_\sigma \delta_\sigma^2[g] \le \int_a^b \tilde g g\,dx.$


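LEMMA 8.2. The series


(8.6) $\sum_\sigma \frac{[y_{i\sigma}(x)]^2}{\lambda_\sigma^2}, \quad i=1,2,\dots,n,$


converge on ab; moreover, if $\sum_\sigma \delta_\sigma^2 < +\infty$, the vector series


(8.7) $\sum_\sigma \frac{y_\sigma(x)}{\lambda_\sigma}\,\delta_\sigma$


converges absolutely and uniformly on this interval.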


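Since


$\frac{y_\sigma(x)}{\lambda_\sigma} = \int_a^b K(x, t)u_\sigma(t)\,dt,$


it follows that for a fixed value of x each row of K(x, t) is a vector satisfying
the conditions of Lemma 8.1. Hence the series (8.6) converge; moreover, in
view of (8.5), there clearly exists a constant $\kappa$ such that the sum of each of the
series (8.6) does not exceed $\kappa$ in value on ab. If $\sum_\sigma \delta_\sigma^2 < +\infty$, the absolute and
uniform convergence of (8.7) on this interval is a consequence of the inequalities


$\sum_{\sigma=N}^{N+h}\left|\frac{y_{i\sigma}(x)}{\lambda_\sigma}\,\delta_\sigma\right| \le \left[\sum_{\sigma=N}^{N+h}\frac{[y_{i\sigma}(x)]^2}{\lambda_\sigma^2}\right]^{1/2}\left[\sum_{\sigma=N}^{N+h}\delta_\sigma^2\right]^{1/2} \le \kappa^{1/2}\left[\sum_{\sigma=N}^{N+h}\delta_\sigma^2\right]^{1/2}, \quad i=1,2,\dots,n.$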


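THEOREM 8.1. For an arbitrary vector f of L the series


(8.8) $\phi(x) = \sum_\sigma y_\sigma(x)d_\sigma[f]$


converges absolutely and uniformly on ab; moreover, $B(x)[f(x)-\phi(x)]=0$ on this
interval. For an arbitrary vector h(x) whose components are integrable on ab,


(8.9) $\int_a^b \tilde hSf\,dx = \sum_\sigma \delta_\sigma[h]\,d_\sigma[f];$

in particular,

(8.10) $\int_a^b \tilde fS_1f\,dx = \sum_\sigma d_\sigma^2[f],$

(8.11) $H[f] = \sum_\sigma \lambda_\sigma d_\sigma^2[f].$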

If $\mathcal{L}[f] = Bg$, $s[f] =0$, it follows from (8.1) that $\lambda_\sigma d_\sigma[f] = \delta_\sigma[g]$,
$(\sigma=1, 2, \dots)$, and the absolute and uniform convergence of (8.8) is a
consequence of Lemmas 8.1 and 8.2. Clearly, $\psi(x) = S(x)[f(x)-\phi(x)]$ satisfies
$\delta_\sigma[\psi]=0$, $(\sigma=1, 2, \dots)$. We will now show that also


(8.12) $\int_a^b K_1(x, t)\psi(t)\,dt = 0.$


Let $f^*$ be the solution of $\mathcal{L}[f^*]=B_1f$, $s[f^*]=0$. Then by Theorem 3.1 of Bliss
[2] the series $\phi^*(x) = \sum_\sigma y_\sigma(x)d_\sigma[f^*]$ converges absolutely and uniformly on
ab, and $B_1[f^*-\phi^*]=0$ on this interval. This latter condition, in view of the
first paragraph of this section, implies $S[f^*-\phi^*]=0$. Consequently, since
$\lambda_\sigma d_\sigma[f^*]=d_\sigma[f]$, $(\sigma=1, 2, \dots)$, we have


$0 = S(x)[f^*(x)-\phi^*(x)] = \int_a^b K_1(x, t)\psi(t)\,dt,$


the latter relation being verified by direct computation. Finally, as


$\psi(x) = \int_a^b K_1(x, t)g(t)\,dt - \sum_\sigma u_\sigma(x)d_\sigma[f],$


it follows from (8.12) and $\delta_\sigma[\psi]=0$, $(\sigma=1, 2, \dots)$, that $\int_a^b \tilde\psi\psi\,dx=0$, and
hence $\psi \equiv 0$ on ab. In particular, $0=T^{-1}S[f-\phi] =B[f-\phi]$.


Equations (8.9) and (8.10) are ready consequences of the relation $Sf = S\phi$
and the uniform convergence of the series $\phi$. Relation (8.11) in turn results
from (8.9), the conditions $\lambda_\sigma d_\sigma[f]=\delta_\sigma[g]$, and $H[f] = \int_a^b \tilde fSg\,dx$.

In view of (8.10) we also have the following result. 


COROLLARY 1. A vector f of L satisfies $\int_a^b \tilde fS_1y\,dx =0$ with every characteristic
solution y of (8.1) if and only if $Bf=0$ on ab.


Corollary 2.2 of Bliss [2], when applied to the system (8.1), would imply 
the result of the preceding corollary for vectors f belonging to L;, instead of 
to the space L. 


COROLLARY 2. If the vector f belongs to L, and $f^*$ in turn satisfies $\mathcal{L}[f^*]=B_1f$,
$s[f^*] =0$, then $f^*(x) = \sum_\sigma y_\sigma(x)d_\sigma[f^*]$.


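In view of the uniform convergence of the series (8.8) associated with f,
and the relation $\lambda_\sigma d_\sigma[f^*]=d_\sigma[f]$, $(\sigma=1, 2, \dots)$, we have


$f^*(x) = \int_a^b K(x, t)S(t)f(t)\,dt = \sum_\sigma \frac{y_\sigma(x)}{\lambda_\sigma}\,d_\sigma[f] = \sum_\sigma y_\sigma(x)d_\sigma[f^*].$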


It is to be mentioned that the above relation $Sf = \sum_\sigma u_\sigma d_\sigma[f]$
for a vector f of L, and the subsequent proof of (8.10), (8.11), could have 
been taken directly from the Hilbert-Schmidt theory of integral equations. 
However, by the above treatment we have proved more; namely, the absolute 
and uniform convergence of the series (8.8) involving the characteristic solu- 
tions of the considered boundary value problem (8.1). 

If the characteristic values of (8.1) are bounded below, for this particular
problem the result of Theorem 7.2 may be strengthened in that the space L
for the problem (2.1) may essentially be substituted for the space $L_1$ belonging
to (8.1). For suppose that the characteristic values are bounded below and
that $\{y_\sigma, \lambda_\sigma\}$ are so ordered that $\lambda_1 \le \lambda_2 \le \cdots$. If $\Gamma_1$ denotes the totality of
vectors f of L satisfying $\int_a^b \tilde fS_1f\,dx=1$ and $\Gamma_1$ is nonvacuous, it follows from
(8.10), (8.11) that $\lambda_1$ is the minimum of H[f] in $\Gamma_1$; moreover, in view of
Corollary 1, it follows by an argument similar to that used in the proof of Theorem
7.2 that this minimum is attained by a particular f if and only if $f= Y_1+\Phi_1$,
where $Y_1$ is a characteristic solution of (8.1) for $\lambda=\lambda_1$ and $\Phi_1$ is a vector of L
such that $B\Phi_1=0$. In general, if $\lambda_1, \dots, \lambda_{m-1}$ exist for (8.1), let $\Gamma_m$ denote
the totality of vectors f of L satisfying $\int_a^b \tilde fS_1f\,dx=1$, $d_\sigma[f]=0$,
$(\sigma=1, \dots, m-1)$. If this class is nonvacuous, we have by a corresponding
argument that $\lambda_m$ exists and is the minimum of H[f] in $\Gamma_m$; moreover,
this minimum is attained by a particular f of $\Gamma_m$ if and only if f is of the
form $Y_m+\Phi_m$, where $Y_m$ is a characteristic solution for $\lambda=\lambda_m$ and $\Phi_m$ is a
vector of L satisfying $B\Phi_m=0$ on ab.

From the above Corollary 1 and equation (8.11) we deduce the following 
theorem. 


THEOREM 8.2. A system (2.1) which satisfies conditions (i), (ii) and (iv) of §2
also satisfies condition (iii)', and is consequently H-definitely self-adjoint, if and
only if the corresponding system (8.1) has no characteristic values $\lambda$ satisfying
$\lambda \le 0$.

It is to be emphasized that there is no assurance under the conditions of 
this theorem that the system (8.1) shall have any characteristic values. For 
example, the system 


is not only definitely self-adjoint (Bliss [2, p. 427]), but also H-definitely 
self-adjoint with the matrix (5.13), whereas this system has no characteristic 
values. Moreover, the corresponding system (8.1) is identical with the given 
system and thus possesses no characteristic values. 

If an H-definitely self-adjoint system has k linearly independent
characteristic solutions it is a consequence of Theorem 6.1 and formula (8.11) that
the corresponding system (8.1) has at least k linearly independent
characteristic solutions. In general, however, when (2.1) is H-definitely self-adjoint
the corresponding system (8.1) may have more linearly independent
characteristic solutions than the original system. To illustrate this possibility,
consider the example (5.11) where, as in §5, it is supposed that $\int_0^1 b(t)\,dt=0$. The
corresponding system (8.1) is


$y_1' = 0, \quad y_2' = -\lambda b^2(x)y_1,$
$y_1(0) - y_2(0) = 0, \quad y_1(1) + y_2(1) = 0,$


and this system is seen to have the single characteristic value $\lambda = 2/\int_0^1 b^2(t)\,dt$
of index one, whereas the original system (5.11) has no characteristic values.
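
The stated characteristic value can be recovered by direct integration. This is a sketch that reads the system displayed above as $y_1'=0$, $y_2'=-\lambda b^2(x)y_1$ with the end conditions $y_1(0)-y_2(0)=0$, $y_1(1)+y_2(1)=0$; that reading of the coefficient is an assumption, chosen because it reproduces the value quoted in the text.

```latex
\begin{align*}
y_1' = 0 \;&\Rightarrow\; y_1(x) \equiv c, \\
y_1(0) - y_2(0) = 0 \;&\Rightarrow\; y_2(0) = c, \\
y_2' = -\lambda b^2(x)\,c \;&\Rightarrow\; y_2(1) = c - \lambda c \int_0^1 b^2(t)\,dt, \\
y_1(1) + y_2(1) = 0 \;&\Rightarrow\; c\Bigl(2 - \lambda \int_0^1 b^2(t)\,dt\Bigr) = 0 .
\end{align*}
% If c = 0 the solution vanishes identically, so a characteristic solution needs c \neq 0,
% hence \lambda = 2 \Big/ \int_0^1 b^2(t)\,dt, with a one-parameter family of solutions.
```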


THEOREM 8.3. For an H-definitely self-adjoint system (2.1) there exists a
constant $d>0$ such that the inequality


(8.13) $\int_a^b \tilde fS_1f\,dx \le d\,H[f]$


holds for arbitrary vectors f of L.


If the corresponding system (8.1) admits of characteristic values, then in 
view of Theorem 8.2 and the minimizing properties of the characteristic val- 
ues, inequality (8.13) holds for d the reciprocal of the smallest characteristic 
value. If (8.1) possesses no characteristic values, then Bf=0 on ab for every 
vector f of L, the two integrals appearing in (8.13) are individually zero, and 
in this case d may be chosen as an arbitrary positive number. 
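
The first case can be written out as follows. The sketch assumes that (8.10) and (8.11) read $\int_a^b \tilde fS_1f\,dx=\sum_\sigma d_\sigma^2[f]$ and $H[f]=\sum_\sigma \lambda_\sigma d_\sigma^2[f]$, and that $\lambda_1>0$ denotes the smallest characteristic value of (8.1); positivity is what Theorem 8.2 supplies.

```latex
\begin{align*}
\int_a^b \tilde f S_1 f\,dx
  = \sum_\sigma d_\sigma^2[f]
  \le \frac{1}{\lambda_1}\sum_\sigma \lambda_\sigma d_\sigma^2[f]
  = \frac{1}{\lambda_1}\,H[f],
  \qquad\text{so that } d = 1/\lambda_1 .
\end{align*}
```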

Since for an H-definitely self-adjoint system (2.1) the elements of the
matrix K(x, t) are continuous on $a \le x,\ t \le b$, and the quadratic functional
(4.3) is positive semi-definite for arbitrary vectors g, we have the following
theorem of Mercer (Mercer [8]; also, for example, [3, p. 456]).


THEOREM 8.4. If (2.1) is H-definitely self-adjoint the series

(8.14) $\sum_\sigma \frac{u_{i\sigma}(x)u_{j\sigma}(t)}{\lambda_\sigma}, \quad i,j=1,2,\dots,n,$

converges absolutely and uniformly on $a \le x,\ t \le b$ and has the sum $K_{1ij}(x, t)$.


9. Further results for H-definitely self-adjoint systems. The conclusions
of the previous section will now be used in the proof of additional results for
an H-definitely self-adjoint problem (2.1). For such a problem let $C_1$ denote
the totality of vectors f of L satisfying the condition


(9.1) $\int_a^b \tilde fSf\,dx = 1.$


THEOREM 9.1. If for an H-definitely self-adjoint problem (2.1) the class $C_1$
is nonvacuous, then this system possesses positive characteristic values; moreover,
the smallest positive characteristic value is the minimum of H[f] in the class $C_1$.


If the class $C_1$ is nonvacuous, let $\lambda_1$ denote the greatest lower bound of
H[f] in this class; it then follows that


$H[f:\lambda_1] = H[f] - \lambda_1\int_a^b \tilde fSf\,dx \ge 0$


for arbitrary vectors f of L. In view of relation (2.5) for a characteristic
solution, there is clearly no positive characteristic value of (2.1) less than $\lambda_1$.
Hence we have only to prove that $\lambda_1$ is a characteristic value.

Theorem 9.1 will be established by indirect argument. Let $f_m=(f_{im})$,
$(m=1, 2, \dots)$, be a sequence of vectors of $C_1$ such that $\lim_{m\to\infty} H[f_m]=\lambda_1$;
on the assumption that $C_1$ is not empty such a sequence exists. Now suppose
that $\lambda=\lambda_1$ is not a characteristic value; then there exist unique corresponding
vectors $h_m(x)$ such that


(9.2) $\mathcal{L}[h_m] - \lambda_1Bh_m = Bf_m, \quad s[h_m] = 0, \quad m=1,2,\dots.$


Moreover, if $G(x, t:\lambda_1)$ denotes the Green's matrix for the incompatible system
$\mathcal{L}[y] - \lambda_1By=0$, $s[y] =0$, we have


$h_m(x) = \int_a^b H(x, t:\lambda_1)S(t)f_m(t)\,dt, \quad m=1,2,\dots,$

where $H(x, t:\lambda_1) = G(x, t:\lambda_1)T^{-1}(t)$. By an elementary vector inequality,


$\mathrm{norm}\{h_m(x)\} \le \kappa\int_a^b \mathrm{norm}\{S(t)f_m(t)\}\,dt = \kappa\int_a^b [\tilde f_mS_1f_m]^{1/2}\,dt,$


for $\kappa$ a suitable constant depending only upon the bounds of the elements of
$H(x, t:\lambda_1)$ on $a \le x,\ t \le b$. By the use of Schwarz' inequality and Theorem 8.3
it then follows that


(9.3) $\mathrm{norm}\{h_m(x)\} \le \kappa\{(b - a)\,d\,H[f_m]\}^{1/2}.$


In particular, since $\{f_m\}$ is a minimizing sequence of $C_1$, the sequence $\{H[f_m]\}$
is uniformly bounded and there exists a constant $\kappa_1$ such that


(9.4) $\int_a^b [\mathrm{norm}\{h_m(x)\}]^2\,dx \le \kappa_1, \quad m = 1,2,\dots.$


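Now set $g_m(x) = f_m(x)+c\,h_m(x)$, $(m=1, 2, \dots)$, where c is a real constant.
Then $g_m$ is a vector of L, and


$H[g_m:\lambda_1] = H[f_m:\lambda_1] + 2cH[f_m, h_m:\lambda_1] + c^2H[h_m:\lambda_1]$
$\qquad = H[f_m:\lambda_1] + 2c + c^2\int_a^b \tilde h_mSf_m\,dx,$


in view of (9.2) and the fact that $f_m$ belongs to the class $C_1$. Now


$\left|\int_a^b \tilde h_mSf_m\,dx\right| \le \int_a^b [\mathrm{norm}\{h_m\}]\cdot[\mathrm{norm}\{Sf_m\}]\,dx$
$\qquad \le \frac{1}{2}\int_a^b [\tilde h_mh_m + \tilde f_mS_1f_m]\,dx$
$\qquad \le \frac{1}{2}(\kappa_1 + d\,H[f_m]),$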


by (9.4) and Theorem 8.3. Consequently, since $\{H[f_m]\}$ is uniformly bounded,
there exists a constant $\kappa_2$ such that


$\left|\int_a^b \tilde h_mSf_m\,dx\right| \le \kappa_2, \quad m= 1,2,\dots.$


Now let c be a value such that $2c+c^2\kappa_2<0$; that is, $0>c>-2/\kappa_2$. As
$\lim_{m\to\infty} H[f_m:\lambda_1] =0$, for m sufficiently large it follows that $H[g_m:\lambda_1] <0$,
contrary to the definition of $\lambda_1$. Hence $\lambda_1$ is a characteristic value and the theorem
is proved.
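
The admissible range of c follows from factoring the quadratic. A sketch, assuming $\kappa_2>0$ denotes the bound on $|\int_a^b \tilde h_mSf_m\,dx|$ obtained above:

```latex
\begin{align*}
2c + c^2\kappa_2 = c\,(2 + c\,\kappa_2) < 0
 \;\Longleftrightarrow\; c < 0 \ \text{and}\ 2 + c\,\kappa_2 > 0
 \;\Longleftrightarrow\; -\frac{2}{\kappa_2} < c < 0 .
\end{align*}
% For such c, H[g_m:\lambda_1] \le H[f_m:\lambda_1] + 2c + c^2\kappa_2,
% and the first term tends to zero, so the right member is eventually negative.
```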

Let $C_{-1}$ denote the totality of vectors f of L satisfying


$\int_a^b \tilde fSf\,dx = -1.$


If the matrix B(x) is replaced by $-B(x)$, then the class $C_{-1}$ for the original
problem corresponds to the class $C_1$ for the modified problem; clearly such a
substitution does not affect the H-definite self-adjointness of the problem.
Hence we have the following result.


COROLLARY. If for an H-definitely self-adjoint problem the class $C_{-1}$ is
nonvacuous, then this system possesses negative characteristic values; moreover, if $\lambda_{-1}$
denotes the largest negative characteristic value, then $-\lambda_{-1}$ is the minimum of
H[f] in the class $C_{-1}$.


For convenience, in the remainder of this section we shall denote the
totality of positive characteristic values of (2.1) by $\{\lambda_m\}$, $(m=1, 2, \dots)$, each
repeated a number of times equal to its multiplicity and ordered so that
$\lambda_1 \le \lambda_2 \le \cdots$. Similarly, $\{\lambda_{-m}\}$, $(m=1, 2, \dots)$, denotes the totality of
negative characteristic values, each repeated a number of times equal to its
multiplicity and the set ordered so that $\lambda_{-1} \ge \lambda_{-2} \ge \cdots$. Corresponding to $\lambda_m$, $\lambda_{-m}$
we shall associate characteristic solutions $y_m$, $y_{-m}$, respectively, such that
$\{y_m, y_{-m}\}$, $(m=1, 2, \dots)$, is a maximal set of linearly independent
solutions orthonormal in the sense of (6.1). Clearly either one, or both, of the
sequences $\{y_m, \lambda_m\}$, $\{y_{-m}, \lambda_{-m}\}$ may be vacuous or consist of only a finite
number of characteristic values and associated characteristic solutions.

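Using the above notation, if $\lambda_1, \dots, \lambda_{s-1}$ exist we shall denote by $C_s$ the
totality of vectors f of L such that


$\int_a^b \tilde fSf\,dx = 1, \quad \int_a^b \tilde fSy_\alpha\,dx = 0, \quad \alpha=1,\dots,s-1.$


Similarly, if $\lambda_{-1}, \dots, \lambda_{-(s-1)}$ exist the class $C_{-s}$ is defined as the totality of
vectors f of L satisfying


$\int_a^b \tilde fSf\,dx = -1, \quad \int_a^b \tilde fSy_{-\alpha}\,dx = 0, \quad \alpha=1,\dots,s-1.$


THEOREM 9.2. If for an H-definitely self-adjoint system the class $C_s$ [$C_{-s}$]
is nonvacuous, then the characteristic value $\lambda_s$ [$\lambda_{-s}$] exists; moreover, $\lambda_s$ [$-\lambda_{-s}$] is
the minimum of H[f] in the class $C_s$ [$C_{-s}$].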


In view of the artifice used in deducing the above corollary, it suffices to
restrict our attention to the case of positive characteristic values. The result
of the theorem might be established by an argument similar to that utilized
by the author [7] in proving a corresponding result for special boundary value
problems associated with the calculus of variations. However, the following
method, which has also been used in considering accessory boundary problems
of the calculus of variations, seems more elegant.

Consider the auxiliary boundary problem involving $n+2(s-1)$ variables
$(y, u, v) \equiv (y_i, u_\alpha, v_\alpha)$, $(i=1, \dots, n;\ \alpha=1, 2, \dots, s-1)$, and consisting
of the differential equations and boundary conditions


yt = 95 + sox) up + 
te = 0, 

Va = Vial Vir 

+ Nizy(b) = 0, 

Ya(a) = 0, 

va(b) = 0, 


where the indices $\alpha$, $\beta$ range from 1 to $s-1$. If capital German letters denote
the matrices of (9.5) corresponding to A, B, M and N of the system (2.1),
we have


Ajj is Ois Biz Oe Ov 
W= |) 02; Oop |], Ons 
VieSij Ou; Oap 
Mi; Ov Nii O~ Ov 
M = || Oap Sas |i, N= Oa; 
Oa; Oag 0a; Oap Sap 


This system is seen to be self-adjoint with the matrix 


Ti; O” Om 
T=] Ons |i, 
Sap 


where $T=\|T_{ij}\|$ is the matrix with which (2.1) is H-definitely self-adjoint.
Condition (ii) of §2 is seen to be satisfied by this system. Since $\mathcal{L}[y_\alpha]=\lambda_\alpha By_\alpha$,
$s[y_\alpha]=0$, $(\alpha=1, \dots, s-1)$, it also follows readily that if (y, u, v) is a
characteristic solution of (9.5), then $u_\alpha=0$, $(\alpha=1, \dots, s-1)$; moreover, for such a
characteristic solution $y \not\equiv 0$ on ab. In particular, for $\lambda=0$ this result implies
that whenever condition (iv) is satisfied by (2.1) this condition also holds for
(9.5). Finally, if (y, u, v) belongs to the corresponding linear vector space $\mathfrak{L}$
for (9.5), then y belongs to the space L for (2.1); also, for such (y, u, v) the
corresponding functional H[y, u, v] reduces simply to H[y]. Therefore,
condition (iii)' for (2.1) implies the corresponding condition for (9.5), and if (2.1)
is H-definitely self-adjoint so also is the latter system.

Now if f belongs to the class $C_s$ for (2.1), the set $\{f_i,\ u_\alpha = \text{constant},\
v_\alpha= \int_a^x \tilde y_\alpha(t)S(t)f(t)\,dt\}$ belongs to the corresponding class $\mathfrak{C}_1$ for (9.5);
conversely, if $(f_i, u_\alpha, v_\alpha)$ belongs to $\mathfrak{C}_1$ the vector f belongs to $C_s$. In particular,
$C_s$ is vacuous if and only if $\mathfrak{C}_1$ is vacuous. If $\mathfrak{C}_1$ is nonvacuous, then by
Theorem 9.1 the minimum of H[y, u, v] = H[y] in this class exists and is the
smallest positive characteristic value of (9.5). Since, as pointed out above, for a
characteristic solution of (9.5) we have $u=0$, $y \not\equiv 0$, it follows that the smallest
positive characteristic value of (9.5) is a characteristic value for (2.1). It is
obvious that the characteristic value thus determined is equal to $\lambda_s$ according
to the previously introduced notation.

We are now in a position to derive a result which is complementary to 
that of Theorem 6.1. 


THEOREM 9.3. A necessary and sufficient condition that an H-definitely
self-adjoint system have at least k positive [negative] characteristic values, where it is
to be understood that each such value is counted a number of times equal to its
multiplicity, is that the quadratic functional $\int_a^b \tilde fSf\,dx$ be positive [negative]
definite on a linear subspace of L of dimension k.


For suppose that positive characteristic values $\lambda_1, \dots, \lambda_k$ exist for such
a system (2.1), and that $y_1, \dots, y_k$ are corresponding orthonormal
characteristic solutions. Then for arbitrary constants $(d_1, \dots, d_k) \ne (0, \dots, 0)$ we
have for $f=y_1d_1+ \cdots +y_kd_k$ that $\int_a^b \tilde fSf\,dx=d_1^2+ \cdots + d_k^2$. On the other
hand, if $\int_a^b \tilde fSf\,dx$ is positive definite on a linear subspace of L of dimension
k, then the classes $C_1, \dots, C_k$ as defined above are seen to be nonvacuous
and (2.1) has at least k positive characteristic values. Again, in view of the
possibility of replacing B by $-B$, the result for negative characteristic values
is a ready consequence of the result for positive characteristic values.
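
The first half of the argument rests on a one-line computation. The sketch assumes the normalization $\int_a^b \tilde y_iSy_j\,dx=\delta_{ij}$ for the characteristic solutions belonging to positive characteristic values, which is how orthonormality in the sense of (6.1) is used here.

```latex
\begin{align*}
f = \sum_{i=1}^{k} y_i d_i
 \;\Longrightarrow\;
\int_a^b \tilde f S f\,dx
 = \sum_{i,j=1}^{k} d_i d_j \int_a^b \tilde y_i S y_j\,dx
 = \sum_{i=1}^{k} d_i^2 > 0
 \quad\text{for } (d_1,\dots,d_k) \neq (0,\dots,0).
\end{align*}
```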

We shall now give a particular condition which is sufficient to insure that
an H-definitely self-adjoint system (2.1) has infinitely many characteristic
values of a given sign. We shall denote by $(v_+)$ the following hypothesis.

$(v_+)$ There is a subinterval $a_1b_1$, $a<a_1<b_1<b$, of ab such that if $a_1'$, $b_1'$ are
arbitrary values satisfying $a_1 \le a_1' < b_1' \le b_1$, then there exists a vector g(x)
and associated f(x) satisfying $\mathcal{L}[f] = Bg$ for which $f \equiv 0$ outside $a_1'b_1'$, whereas
$\int_a^b \tilde fSf\,dx >0$.

The condition obtained by replacing in $(v_+)$ the relation "$\int_a^b \tilde fSf\,dx >0$" by
"$\int_a^b \tilde fSf\,dx <0$" will be referred to as $(v_-)$. Using the device of the proof of
Theorem 6.2, and the result of the above theorem, one obtains the following
conclusion.


THEOREM 9.4. If an H-definitely self-adjoint system (2.1) satisfies hypothesis
$(v_+)$, then this system admits infinitely many positive characteristic values.
Similarly, if such a system satisfies condition $(v_-)$, there exist infinitely many negative
characteristic values.


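In agreement with our modified notation for the characteristic values and
solutions of an H-definitely self-adjoint system (2.1), we write


$c_\mu[f] = \frac{|\lambda_\mu|}{\lambda_\mu}\int_a^b \tilde fSy_\mu\,dx, \quad \mu=1,-1,2,-2,\dots.$


THEOREM 9.5. For an arbitrary vector f of L,


(9.6) $\int_a^b \tilde fSf\,dx = \sum_\mu \frac{|\lambda_\mu|}{\lambda_\mu}\,c_\mu^2[f];$


moreover, if f and h are vectors of L,


(9.7) $\int_a^b \tilde fSh\,dx = \sum_\mu \frac{|\lambda_\mu|}{\lambda_\mu}\,c_\mu[f]\,c_\mu[h].$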


In view of Theorem 9.2, relation (9.6) is readily seen to be true if (2.1) 
admits only a finite number of characteristic values. We shall prove this rela- 
tion for the case in which this system has infinitely many positive, and also 
infinitely many negative, characteristic values; the modification in the proof 
whenever the system has only a finite number of characteristic values of one 
sign is obvious. 

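Corresponding to a vector f of L and a given positive integer k, set
$f_k(x) = f(x) - \sum_{|\mu| \le k} y_\mu(x)c_\mu[f]$. Then $c_\sigma[f_k]=0$, $(\sigma= -k, \dots, k)$, and by the
minimizing properties of the characteristic values of (2.1) we have


$H[f_k] \ge \lambda_{k+1}\int_a^b \tilde f_kSf_k\,dx \ge 0$


if $\int_a^b \tilde f_kSf_k\,dx \ge 0$; whereas

$H[f_k] \ge \lambda_{-(k+1)}\int_a^b \tilde f_kSf_k\,dx \ge 0$

if $\int_a^b \tilde f_kSf_k\,dx \le 0$. Consequently,


$\left|\int_a^b \tilde f_kSf_k\,dx\right| \le \max\left\{\frac{1}{\lambda_{k+1}}, \frac{1}{|\lambda_{-(k+1)}|}\right\}H[f_k].$


Moreover, since


$0 \le H[f_k] \le H[f],$


it follows that


$0 = \lim_{k\to\infty}\int_a^b \tilde f_kSf_k\,dx = \int_a^b \tilde fSf\,dx - \lim_{k\to\infty}\sum_{|\mu| \le k}\frac{|\lambda_\mu|}{\lambda_\mu}\,c_\mu^2[f].$
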
Since the series involved in (9.6) clearly converges absolutely, this relation
is established. Relation (9.7) is then immediate since if f and h are vectors
of L, so also are $f+h$ and $f-h$, and


$4\int_a^b \tilde fSh\,dx = \int_a^b \widetilde{(f+h)}S(f + h)\,dx - \int_a^b \widetilde{(f-h)}S(f-h)\,dx,$


$c_\mu[f \pm h] = c_\mu[f] \pm c_\mu[h].$
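
The identity used here is the usual polarization of a quadratic functional. A sketch, relying only on the symmetry of S and writing a tilde for the transpose:

```latex
\begin{align*}
\int_a^b \widetilde{(f \pm h)}\,S\,(f \pm h)\,dx
 &= \int_a^b \tilde f S f\,dx \pm 2\int_a^b \tilde f S h\,dx + \int_a^b \tilde h S h\,dx, \\
4\int_a^b \tilde f S h\,dx
 &= \int_a^b \widetilde{(f+h)}\,S\,(f+h)\,dx - \int_a^b \widetilde{(f-h)}\,S\,(f-h)\,dx .
\end{align*}
% Applying (9.6) to f+h and f-h, together with the linearity of the coefficients
% c_\mu, turns the right member into the series of (9.7).
```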


COROLLARY 1. If f is a vector of $L^2$ then the equality sign holds in (6.2),
that is


(9.8) $H[f] = \sum_\mu |\lambda_\mu|\,c_\mu^2[f].$


Let h be a vector of L such that $\mathcal{L}[f]=Bh$, $s[f]=0$. Since $\lambda_\mu c_\mu[f] =c_\mu[h]$,
$(\mu=1, -1, 2, -2, \dots)$, this corollary is a consequence of (9.7) and the
relation $H[f]=\int_a^b \tilde fSh\,dx$.

It is to be noted that in general we do not have relation (9.8) for arbitrary 
vectors f of the space L. This fact is shown by the example (5.11), in view of 
the comment immediately preceding Theorem 8.3. We do have, however, the 
following result. 


COROLLARY 2. If the class $C_s$, $(s=1, -1, 2, -2, \dots)$, is nonvacuous for an
H-definitely self-adjoint system (2.1), then the minimum of H[f] in this class
is attained by a particular f of $C_s$ if and only if $f= Y_s(x)+\Phi_s(x)$, where $Y_s(x)$
is a characteristic solution for $\lambda=\lambda_s$ and $\Phi_s$ is a vector of L satisfying $B\Phi_s=0$
on ab.


By the use of inequality (6.2), relation (9.6), and an argument similar to
that employed in the proof of Theorem 7.2, we have that if f is a vector of a
nonvacuous class $C_s$ which renders H[f] its minimum value in this class,
then $f= Y_s+\Phi_s$, where $Y_s$ is a characteristic solution of (2.1) for $\lambda=\lambda_s$ and $\Phi_s$
is a vector of L such that $c_\mu[\Phi_s]=0$, $(\mu=1, -1, 2, -2, \dots)$. It then follows
from Corollary 1 to Theorem 5.4 that $\int_a^b \tilde\Phi_sSy\,dx =0$ for every vector y of L
and thus, in particular, $\int_a^b \tilde\Phi_sS\Phi_s\,dx=0$. Then


$0 = H[f:\lambda_s] = H[Y_s:\lambda_s] + 2H[Y_s, \Phi_s:\lambda_s] + H[\Phi_s:\lambda_s]$
$\quad = H[\Phi_s:\lambda_s] = H[\Phi_s],$
and hence $B(x)\Phi_s(x) =0$ on ab by condition (iii)'.


THEOREM 9.6. If (2.1) is H-definitely self-adjoint and we write $v_\mu(x)
= S(x)y_\mu(x)$, $(\mu=1, -1, 2, -2, \dots)$, then each of the series


(9.9) $\sum_\mu \frac{[v_{i\mu}(x)]^2}{|\lambda_\mu|}, \quad i=1,2,\dots,n,$


converges and its sum does not exceed $K_{1ii}(x, x)$ on ab. For fixed values of one of
the variables x, t, each of the series


(9.10) $\sum_\mu \frac{v_{i\mu}(x)v_{j\mu}(t)}{|\lambda_\mu|}, \quad i,j=1,2,\dots,n,$


converges absolutely and uniformly in the other variable on ab. Finally, for an
arbitrary vector f of L the vector series


(9.11) $\sum_\mu v_\mu(x)c_\mu[f]$


converges absolutely and uniformly on ab.


Corresponding to an arbitrary vector g(x), define f by $\mathcal{L}[f]=Bg$, $s[f]=0$.
Then for an arbitrary integer k,


(9.12) $0 \le H\Bigl[f - \sum_{|\mu| \le k} y_\mu c_\mu[f]\Bigr] = \int_a^b\int_a^b \tilde g(x)\Bigl\{K_1(x, t) - \sum_{|\mu| \le k}\frac{v_\mu(x)\tilde v_\mu(t)}{|\lambda_\mu|}\Bigr\}g(t)\,dx\,dt.$


Applying the argument of Lemma 4.2 to the double integral of (9.12), it
follows in particular that


$\sum_{|\mu| \le k}\frac{[v_{i\mu}(x)]^2}{|\lambda_\mu|} \le K_{1ii}(x, x)$


for each integer k. Hence the series (9.9) converges and its sum does not
exceed $K_{1ii}(x, x)$. Since the sum of this series is uniformly bounded on ab,
Cauchy's inequality insures that each of the series (9.10) converges absolutely
and uniformly in each of the variables on ab for fixed values of the other
variable. In particular, each of these series defines a function which is continuous
in each of the variables x, t separately on ab. Since for an arbitrary f of L the
series $\sum_\mu |\lambda_\mu|\,c_\mu^2[f]$ converges, the absolute and uniform convergence of (9.11)
is a consequence of the uniform boundedness of the sum of the series (9.9) on
ab and Cauchy's inequality.
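
The last step can be made explicit. The sketch assumes that (9.9) is the series $\sum_\mu [v_{i\mu}(x)]^2/|\lambda_\mu|$ with sum bounded by a constant $\kappa$ on ab, and that (9.11) is $\sum_\mu v_\mu(x)c_\mu[f]$; both readings are inferred from the proof rather than quoted from the statement of the theorem.

```latex
\begin{align*}
\sum_{\mu \in M} \bigl| v_{i\mu}(x)\,c_\mu[f] \bigr|
 &= \sum_{\mu \in M} \frac{|v_{i\mu}(x)|}{|\lambda_\mu|^{1/2}}\;|\lambda_\mu|^{1/2}\,|c_\mu[f]| \\
 &\le \Bigl[\sum_{\mu \in M} \frac{[v_{i\mu}(x)]^2}{|\lambda_\mu|}\Bigr]^{1/2}
      \Bigl[\sum_{\mu \in M} |\lambda_\mu|\,c_\mu^2[f]\Bigr]^{1/2}
 \le \kappa^{1/2}\Bigl[\sum_{\mu \in M} |\lambda_\mu|\,c_\mu^2[f]\Bigr]^{1/2},
\end{align*}
% where M is any finite set of indices. The last factor is independent of x and is
% small for every M lying beyond a fixed index range, since the series
% \sum_\mu |\lambda_\mu| c_\mu^2[f] converges.
```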

The proof of the above convergence properties of (9.9), (9.10) parallels
that of corresponding results (see, for example, [3, p. 456]) used in
establishing Theorem 8.4 for the boundary problem (8.1). We are unable to prove for
(2.1) a result as general in character as Theorem 8.4 gives for system (8.1),
however, since for an H-definitely self-adjoint system we do not in general
have that relation (9.8) is valid for arbitrary vectors f of L. When this latter
condition is fulfilled for a particular H-definitely self-adjoint problem it then
follows that the sum of the series (9.10) is $K_{1ij}(x, t)$; in particular, the sum
of (9.9) is $K_{1ii}(x, x)$, by Dini's theorem the convergence of this latter series
is uniform on ab, and it then follows that the series (9.10) converges absolutely
and uniformly in (x, t) jointly.


THEOREM 9.7. For a vector f of L relation (9.8) holds if and only if


(9.13) $S(x)f(x) = \sum_\mu v_\mu(x)c_\mu[f].$


From the above theorem we know that the right-hand member of (9.13)
converges absolutely and uniformly on ab, and hence defines a vector whose
components are continuous on this interval. If $\mathcal{L}[f]=Bg$, $s[f]=0$, and
relation (9.13) holds for f, then


$H[f] = \int_a^b \tilde gSf\,dx = \sum_\mu |\lambda_\mu|\,c_\mu^2[f],$


since $c_\mu[g] =\lambda_\mu c_\mu[f]$. On the other hand, if we define $f_k(x) = f(x)
- \sum_{|\mu| \le k} y_\mu(x)c_\mu[f]$, relation (9.8) is equivalent to the condition $\lim_{k\to\infty} H[f_k]
=0$, whereas (9.13) is equivalent to $F(x) =\lim_{k\to\infty} S(x)f_k(x) =0$. In view of
relation (8.13) for $f_k$ it then follows that if (9.8) holds for a vector f then the
associated vector F is identically zero, that is, relation (9.13) is also valid.

Since the matrix T is nonsingular, it is clear that relation (9.13) holds for
a particular f if and only if the series $\sum_\mu B(x)y_\mu(x)c_\mu[f]$ converges absolutely
and uniformly on ab, and


(9.14) $B(x)f(x) = \sum_\mu B(x)y_\mu(x)c_\mu[f].$


If f is a vector of $L^2$, then the series $\phi(x) = \sum_\mu y_\mu(x)c_\mu[f]$
converges absolutely and uniformly on ab, and $B(x)[f(x)-\phi(x)] =0$.

Corollary 1 to Theorem 9.5 and Theorem 9.7 imply that relation (9.13),
and hence (9.14), holds for such an f. We have, therefore, only to prove the
absolute and uniform convergence of the series $\phi$. Let h(x) be a vector
of L such that $\mathcal{L}[f]=Bh$, $s[f]=0$. Now since h belongs to L, the series
$\sum_\mu v_\mu(x)c_\mu[h] = \sum_\mu \lambda_\mu v_\mu(x)c_\mu[f]$ converges absolutely and uniformly
on ab. Hence the corresponding convergence of $\phi$ is a consequence of


$y_\mu(x)c_\mu[f] = \lambda_\mu c_\mu[f]\int_a^b G(x, t)B(t)y_\mu(t)\,dt$


$\qquad = \int_a^b G(x, t)T^{-1}(t)\,\lambda_\mu v_\mu(t)c_\mu[f]\,dt.$



In conclusion, we shall prove the following general expansion theorem.

THEOREM 9.8. If for an arbitrary vector f the series $\sum_\mu B(x)y_\mu(x)c_\mu[f]$
converges uniformly and satisfies relation (9.14) on ab, then for $f^*(x)$ defined by the
system $\mathcal{L}[f^*]=Bf$, $s[f^*]=0$, the series $\sum_\mu y_\mu(x)c_\mu[f^*]$ converges uniformly on
this interval to $f^*(x)$.


This result is an immediate consequence of the relations


$f^*(x) = \int_a^b G(x, t)B(t)f(t)\,dt = \sum_\mu \Bigl\{\int_a^b G(x, t)B(t)y_\mu(t)\,dt\Bigr\}c_\mu[f] = \sum_\mu y_\mu(x)c_\mu[f^*].$


10. A boundary problem of the calculus of variations. We shall now
consider a system of the form (2.1) associated with the problem of Bolza in the
calculus of variations. The symbols $\eta=(\eta_i)$, $\eta'=(\eta_i')$ will denote the
functions $[\eta_1(x), \dots, \eta_n(x)]$ and the set of their derivatives, respectively. Let


(10.1) $J[\eta] = 2Q[\eta(a), \eta(b)] + \int_a^b 2\omega(x, \eta, \eta')\,dx,$


where $\omega$ and Q are quadratic forms in the 2n variables $\eta_i$, $\eta_i'$ and $\eta_i(a)$, $\eta_i(b)$,
respectively. The functional $J[\eta]$ is of the form of the second variation of a
problem of Bolza. The boundary value problem to be considered consists of
the Euler-Lagrange differential equations and transversality conditions for
the problem of minimizing $J[\eta]$ in a class of arcs $\eta=\eta(x)$ which satisfy a set
of ordinary linear differential equations of the first order


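(10.2) $\Phi_\alpha(x, \eta, \eta') \equiv \Phi_{\alpha\pi_i}\eta_i' + \Phi_{\alpha\eta_i}\eta_i = 0, \quad \alpha=1,\dots,m<n,$
the linear homogeneous end conditions
(10.3) $\Psi_\gamma[\eta(a), \eta(b)] \equiv \Psi_{\gamma;ia}\eta_i(a) + \Psi_{\gamma;ib}\eta_i(b) = 0, \quad \gamma=1,\dots,p \le 2n,$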


and which render a fixed constant value to the integral 
(10.4) dee 


For the general problem of Bolza the second variation may be written as
(10.1) if one includes in the set $\eta$ not only the variations of the dependent
functions in the original problem of Bolza, but also two additional functions
representing the variations of the end values; these latter two functions are
further restricted by including in (10.2) two additional differential equations
which require them to be constant on ab.


Throughout the present section the following subscripts have the ranges in- 
dicated : 4, j, k=1,--- ,n;a,B=1, v=1,--- +P; 
6,@=1,---,2n—p. Partial derivatives of w(x, n, x), Ba(x, with respect 
to the variables 7;, 7; will be denoted by writing these variables as subscripts; 
correspondingly, derivatives of Q and V, with respect to the arguments 7,(a), 
ni(b) will be denoted by Qis, Qa, respectively. 

The analysis of this section is based on the following hypotheses. 

(Hi) The coefficients of the quadratic form w(x, 7, +) and the linear forms 
®.(x, », 7) are real single-valued functions of x on ab. The functions 
Pax; are of class C', while the functions Ris = are 
continuous on this interval. Finally, the matrix I|ee,(x)] is of rank m on ab, 
the coefficients of the quadratic form Q and the linear forms V, are real con- 
stants, and the matrix | has rank p. 

(Hz) The matrix 
(10.5) | 

Dar; Ong 
is nonsingular on ab. 

An arc $\eta$ will be termed differentially admissible if its components $\eta_i(x)$ are
of class $D'$ on ab, and satisfy $\Phi_\alpha=0$ on this interval. An arc whose end values
at a and b satisfy $\Psi_\gamma=0$ will be called terminally admissible; finally, an arc
which is both differentially and terminally admissible will be said to be
admissible.

$(H_3)$ There exist p differentially admissible arcs $\eta_i=\eta_{i\nu}$ such that the
determinant $|\Psi_\gamma[\eta_\nu(a), \eta_\nu(b)]|$ is different from zero.

For arbitrary constants yu. define 


Under the hypotheses (H:), (Hz), (Hs) it follows from the theory of the prob- 
lem of Bolza that if n(x) is a minimizing arc for the above defined calculus of 
variations problem, then there exist multipliers \=constant, u.=a(x) such 
that the set [ni(x), ua(x), satisfies the differential equations 


0, — Q(x, 2, + = O, 
(10.7) 
©,(x, 0, = 0; 


moreover, there exist constants d, satisfying 
Qialn] + — 2, = 0, 
(10.8) + + |” = 0, 
¥,[n(a), n(6)] = 0. 
As (10.5) is nonsingular, the set of m-+-n equations 
(10.9) = 0,,(%,9, 7,4), = 0, 




has unique solutions 

(10.10) = + ta = bag + maf 
Substituting these values in 0,,(x, 9, 7, u), we obtain 


where in view of the above hypotheses the functions $\mathfrak{A}_{ij}$, $\mathfrak{B}_{ij}$, $\mathfrak{C}_{ij}$ are
continuous on ab; moreover, the matrices $\|\mathfrak{B}_{ij}\|$ and $\|\mathfrak{C}_{ij}\|$ are symmetric and $\|\mathfrak{B}_{ij}\|$ is
of rank $n-m$ on this interval. Consequently, the differential equations (10.7)
are equivalent to the system

Liln, = nf — — = 0, 
Losiln, = — Cel + = — 
Now if $c_i=c_{i\sigma}$, $d_i=d_{i\sigma}$, $(\sigma=1, \dots, 2n-p)$, are linearly independent solutions
of the equations $\Psi_{\gamma;ia}c_i+\Psi_{\gamma;ib}d_i=0$, $(\gamma=1, \dots, p)$, the boundary
conditions (10.8) are equivalent to the linearly independent set

$s_\gamma[\eta, \zeta] \equiv \Psi_\gamma[\eta(a), \eta(b)] = 0,$
$s_{p+\sigma}[\eta, \zeta] \equiv c_{i\sigma}\{Q_{ia}[\eta] - \zeta_i(a)\} + d_{i\sigma}\{Q_{ib}[\eta] + \zeta_i(b)\} = 0.$

The system (10.7'), (10.8'), which is clearly of the form (2.1) in $y=(\eta_i, \zeta_i)$,
may be shown under the above hypotheses to satisfy conditions (i), (ii) and
(iv) of §2 with the matrix


(10.7’) 
(10.8) 


T= 


— 55; 03; 
For this system we have 


Ai; Bi; 0;; 0%; | Riz 055 
A= Be ’ 
Ci; — Riz 055 0:5; 055 


We now wish to consider the condition (iii)’ for such a system. The linear 
vector space L for this problem consists of sets (y, £) which satisfy with a cor- 
responding w=(w,) the system 


It is readily seen from (10.8) that if s.[n, ¢]=0, then 


nile = — 20[n]. 


Evaluating H=H[n, {] for such a set (n, £) we find 


H = +f (ni + — dx = 




in view of (10.9) and (10.11). Consequently, (iii)′ for this system reduces to the condition that $J[\eta] > 0$ for every set $(\eta, \zeta)$ which satisfies with an associated vector w the system (10.12), and for which $(R_{ij}\eta_j) \not\equiv (0_i)$ on ab. If $(\eta, \zeta)$ is a solution of (10.12) for a given vector w, then clearly $\eta$ is an admissible arc. Hence (iii)′ is certainly satisfied if the following condition holds.

($H_4$) $J[\eta] > 0$ for arbitrary nonidentically vanishing admissible arcs $\eta$.

We thus see that a system of the form treated by Reid [9] for which, using the notation of that paper, the quadratic form $G[\eta(a), \eta(b)]$ is identically zero, is H-definitely self-adjoint.


THEOREM 10.1. Suppose that a problem of the above sort is H-definitely self-adjoint. If the matrix R(x) is positive [negative] definite at a point $x_0$ of ab, then this system has infinitely many positive [negative] characteristic values.


On the assumption that (xo) is positive definite, there clearly exists a 
subinterval throughout which (x) remains positive defi- 
nite. Corresponding to an arbitrary subinterval ajb{ of a,b, denote by 
t=(f:,), (o=1,-+-, 2+1), a set of m+1 vectors whose components are of 
class C! on ab, are identically zero outside the subinterval a{ bj, and such 
that the vectors (e=1, - - , #+1), are linearly independent on aj 
Such vectors clearly exist since 8 is of rank n—m on ab; in fact, all that is nec- 
essary to insure the existence of such vectors is that 8 40 on aj bj . Now define 
n, as the solution of nj, =0. Clearly the vec- 
tors 7, are linearly independent on ab; moreover, there exist constants 
(di, - , not all zero such that if we set 7= dit +7 then 
ni(b)=0. This vector satisfies with +€n4idn41 the system 
ni =Ainjyt+ Bist; on ab; furthermore, as [;=0; outside aj b{, it follows from 
the conditions 7;(a)=0=7;(b) that »;=0; outside this subinterval. Since 
is nonsingular on aj bj there clearly exists a corresponding w such that the 
differential equations of (10.12) are satisfied by (n, £, w); the boundary con- 
ditions are also satisfied by (n, £) since this set vanishes at x=a and x=). 
Consequently, since on a; b{ we have that #0 and the matrix § is positive 
definite, while 7 =0 outside this subinterval, it follows that the thus deter- 
mined solution y=(n, £) of (10.12) satisfies the conditions described in hy- 
pothesis (v,) of §9. Hence by Theorem 9.4 the considered system has 
infinitely many positive characteristic values. The corresponding result for 
negative characteristic values is readily deducible from the above by consider- 
ing the related boundary value problem obtained by replacing the matrix 
R(x) by — R(x). 

11. A particular differential system. Krein [7] and Kamke [6] have con- 
sidered a self-adjoint boundary problem of the form 


(11.1) = rk(x)u, U.[u] = 0, o=1,---,2n, 


where ([u] is a differential operator of the form 




= 


1,(x) 0 and 1,(x), (v=1,---, ), of class on ab, while the U,[u] are in- 
dependent linear forms in the end values of u, u’,---, u@*-» at x=a and 
x=b. Each of these authors has assumed that the functional 


(11.2) en dx 


possesses certain properties of definiteness. 

Krein has supposed that (11.2) is non-negative for every function u which 
is of class C‘**) on ab and such that u, u’,---, u°*-" all vanish at a and d; 
moreover, that the continuous function k(x) occurring in (11.1) is non-nega- 
tive throughout ab. Kamke [6, I] has assumed that (11.2) is non-negative 
for every function u of class C°*) which satisfies the boundary conditions 
U,{u]=0; moreover, that \=0 is not a characteristic value of (11.1). In ad- 
dition, Kamke has also treated the case in which the continuous function 
k(x) changes sign on ab. 

It will now be shown that a system (11.1) may be written as one of the 
type considered in the preceding section. Now ([u] is the Euler expression 
for the integral 


(11.3) { x) 14? — + (— (x) }2} dx. 


This integral, by a device familiar in the calculus of variations, is equivalent 
under the substitution 71: to the integral 


2 2 2 n 
(11.4) — + +(— 1) m+ (— } dx, 
together with the auxiliary linear differential equations 
(11.5) = 2 — Nati = 0, a=1,---,n—1, 


Suppose u is of class C*) and satisfies the nonhomogeneous differential 
equation 


(11.6) Lu] + f(x) = 0. 

If we set 


i=1,---,n, 


(11.7) 


it is readily seen that (n:, {;) satisfy the first order system 




Na = Nat+ly 
(11.8) * 


oi = lo(x)m + f(x), 
Site (- = far a=1,--- 1. 


Conversely, if $(\eta_i, \zeta_i)$ is a solution of (11.8) it follows that $u = \eta_1$ satisfies (11.6); moreover, u and its first 2n−1 derivatives are related to $(\eta_i, \zeta_i)$ by the equations (11.7). Hence there is complete equivalence between the single linear equation (11.6) of order 2n and the system (11.8) of 2n linear differential equations of the first order. For f(x) = 0 this latter system is the canonical form of the Euler-Lagrange equations for the integral (11.4) subject to the auxiliary differential equations (11.5).

Now the $U_\sigma[u]$ are supposed to be 2n independent linear forms in the end values of $u, u', \ldots, u^{(2n-1)}$ at x = a and x = b. In view of the assumption that $l_n(x) \neq 0$ on ab it follows that they may equally well be considered as independent linear forms in the end values of the corresponding $\eta_i$, $\zeta_i$ at a and b; consequently, the set $U_\sigma[u] = 0$ may be written as


(11.9) soln, = — (a) + + = 0, 
: o=1,---+,2n. 


If u and u* are of class C“” on ab, it follows from the self-adjoint character 
of C[u] that (see, for example [5, p. 123]) 


where P(u; u*) is bilinear in the sets (u, u’,---, u'?*-") and (u*, u*’,---, 
u*(2m—-1)) and is the so-called bilinear concomitant. In particular, if (n:, {:) and 
(n#*, ¢*) are defined by (11.7) for u and u*, respectively, it may be readily 
verified that 


P(u; u*) = — 


The self-adjoint character of the boundary conditions implies that for arbitrary 
functions u, u* whose end values satisfy U,[u]=0=U,[u*], (o=1, ---, 2m), 
we have P(u; u*)|?=0. Consequently, if ¢;) and (n#, are arbitrary sets 
of functions satisfying s.[n, ¢]=0=s.[n*, ¢*] we must have 


(11.10) P(x) — |? = 0. 


Now the general solution of s,[n, {]=0, (c=1,-+-, 2m), is of the form 




= (a), = i 
2 
ni(b) = $b) = — &d,;, 


where (cj, dj, cj, dj)=(ch,, di, dy), (r=1,-++, 2m), are 2m linearly inde- 
pendent solutions of the system 


(11.11) 


id + iC; be id; = 0, o=1,---, 2m. 
Corresponding to = (é,), &* =(§*), determine the end values of (n:, and 
(n¥, £*) by equations (11.11). Because of the arbitrariness of £ and &* relation 
(11.10) then implies 
(11.12) — Cashes + — Cases = 0, 


whence it follows that there is a nonsingular matrix || Z,,|| such that 


1 1 1 1 2 2 2 > 


Consequently, writing £,E,,=e,, a set (ni, t) is seen to satisfy (11.9) if and 
only if there are constants (e,) such that 
1 1 
2 . 2 
ni(b) = = — 
Either from (11.12), or from substitution of (11.11’) in (11.10), it follows that 
the 2m X2n matrix : 
(11.13) || = + 


is symmetric. Now if the 2n X2n matrix ||b}, 53,|| is of rank 2n—p, denote by 
r=(r,)=(r-), (Y=1, +--+, p), a set of p linearly independent solutions of the 
equations 


(11.11’) 


= 0, = 0, 


If (i, £s) satisfies (11.9), then 7;(a), 7;(b) must satisfy 
(11.14) Vy = Vy; + Vy; = 0, 


where Vy. =1ye02,. Since s,[n, are independent linear forms, 
the p conditions (11.14) are seen to be also linearly independent. 

If p=2n, the problem (11.1) is then seen to be equivalent to one of the 
sort studied in §10 with 2w defined as the integrand of (11.4), the auxiliary 
differential equations &,=0 and boundary conditions determined by (11.5) 
and (11.14), respectively, the quadratic form Q=0, while the matrix R(x) is 
defined as 

0. Oas 




In general, it is to be noted that 6),V,.j.+02,V,.0=0, (e=1,---, 2n; 
vy=1,---, p), and since the linear forms of (11.14) are independent and 
||03 ,02,|| is of rank 2n—p it follows that if +--+, 2n), 
then there must exist constants d, such that wj.=d,W,.ja, we=d,V,,p, 
(j=1,---, m). Moreover, for (e,) related to the end values of (n:, ¢;) by 
(11.11’) we have 


(11.16) ae + = Keres. 


Since r=(r.), (y=1, , p), satisfies =0, the rank of ||z,,|| does not ex- 
ceed 2n—p. If this matrix is of rank 2n—g, denote by p=(p,)=(p,,), 
(v=1, - - -,g),setsorthonormal in the sense that p,,pxr = 5x, (v,x=1, +--+, 9), 
and satisfying k,,p,=0, (0=1,---+-, 2m). Then the matrix 

her Pxe 
Por Orx 


is nonsingular, and its reciprocal is a symmetric matrix of the form 


hes Pxe 
Por Orx 


If now (ni, §%) satisfies (11.9), and (e,) is determined by (11.11’), it follows 
from (11.16) that there exist constants ¢,, (v=1, - - - , g), such that 


(11. 17) = Nex + a; ;(b) + 

Writing 

(11.18) [n] = [asns(a) + la, + a; 

it then follows from (11.11’) and (11.17) that 

Qia [n] + Ors 0, 

Qisln] + Ors + = 0, #=1,---,m. 


On the other hand, since 


(11.19) 


1 1 2 2 
be i) + be = (por kre) = 0, 


there exist constants d, such that ja, It thus 
follows that (11.1) is equivalent to a boundary value problem of the type 
treated in the preceding section with 2w defined as the integrand of (11.4), 
the auxiliary differential equations and end conditions defined by (11.5) and 
(11.14), respectively, the quadratic form Q of (11.18), and &(x) given in 
(11.15). 




Finally, if uw is of class C°” on ab and (n:, £:) are defined by (11.7), it is 
readily seen that u.C[u]=2w(x, 7, £5)’ and 


b 
f ule dx =f 2w(x, 0, n') dx — = f 2w(x, 0, dx + 20[n} 


whenever $U_\sigma[u] = 0$. In particular, the hypotheses of Kamke on (11.1) are seen to imply that the above defined equivalent problem is H-definitely self-adjoint in the sense of §2. Actually, a self-adjoint system (11.1) is H-definitely self-adjoint if $\lambda = 0$ is not a characteristic value, and the functional (11.2) is non-negative for arbitrary functions u of class $C^{(2n)}$ satisfying with a continuous function g(x) the system $L[u] = k(x)g(x)$, $U_\sigma[u] = 0$. In case k(x) vanishes or changes sign on ab this condition is slightly weaker than that used by Kamke.

In conclusion, it is to be remarked that once the symmetry of $\|k_{\sigma\tau}\|$ is established, the existence of linear forms $\Psi_\gamma$ and a quadratic form Q such that the boundary conditions $s_\sigma[\eta, \zeta] = 0$ reduce to (11.14), (11.19) has been proved by Hu [4, pp. 380-382]. The above presentation, however, determines more explicitly the form of the $\Psi_\gamma$ and Q in terms of the coefficients of the forms $s_\sigma[\eta, \zeta]$.

12. H-definitely self-conjugate adjoint systems. In the preceding sections 
we have been concerned with a system (2.1) involving real-valued coefficients. 
However, the notion of H-definite self-adjointness may be extended to a sys- 
tem (2.1) whose coefficients are complex-valued in a manner previously pre- 
sented by Reid [11] for extending the notion of definite self-adjointness to 
such a system. 

In the following we shall therefore suppose that the elements of A(x), B(x) are complex-valued continuous functions of the real variable x on ab, and that the coefficient matrices M and N of the linearly independent boundary conditions $s_i[y] = 0$ have complex-valued elements. If $K = \|K_{ij}\|$, then we shall denote by $\bar{K}$ the matrix $\|\bar{K}_{ij}\|$ whose elements are the complex conjugates of the corresponding elements of K; moreover, $K^*$ shall denote the conjugate transpose matrix $\|\bar{K}_{ji}\|$. As in Reid [11] we shall also consider the system


(12.1) = —duB,~ ilu] = u(a)P + u(d)0 = 0, 


where P and Q are the matrices occurring in the boundary conditions of the 
adjoint system (2.2). System (12.1) is termed the conjugate adjoint of (2.1). 
The system (2.1) is said to be self-conjugate adjoint with the matrix T if it is 
equivalent to (12.1) under the transformation u = T7T(x)y, where the elements 
of T(x) are complex-valued functions which are of class C' on ab, and T is non- 
singular on this interval. It follows (Reid [11, Theorem 2.1]) that (2.1) is 
self-conjugate adjoint with T(x) if and only if 




$TA + A^*T + T' = 0, \qquad TB + B^*T = 0$ on ab,
$MT^{-1}(a)M^* = NT^{-1}(b)N^*.$


We shall now say that (2.1) is H-definitely self-conjugate adjoint with the 
matrix T, or merely H-definitely self-conjugate adjoint if: 

(i) The system is self-conjugate adjoint with T. 

(ii) The matrix $S(x) = T^*(x)B(x)$ is hermitian.

(iii) If the linear vector space L be defined as in §2, with the understanding now that the components of y and g are complex-valued, then the functional


(12.2) 


aly] = f 9 ae, 


which is readily seen to be real-valued on this space L, is positive for arbitrary vectors y of L such that $By \not\equiv 0$ on ab.

(iv) There exists no nonidentically vanishing solution y of $L[y] = 0$, $s[y] = 0$ such that $By \equiv 0$ on ab.


THEOREM 12.1. All the characteristic values of an H-definitely self-conjugate 
adjoint system (2.1) are real. 


For if y were a characteristic solution of an H-definitely self-conjugate adjoint system corresponding to a non-real characteristic value $\lambda$, it would follow as in the proof of Theorem 3.1 of Reid [11] that $\int_a^b \bar{y}Sy\,dx = 0$. Since for such a characteristic solution we have $H[y] = \lambda\int_a^b \bar{y}Sy\,dx$, it then ensues that $H[y] = 0$. Because of the above condition (iii) it would then follow that $By \equiv 0$ on ab, which is impossible for a characteristic solution by condition (iv). Hence all the characteristic values of such a system are real.
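The reality argument above rests on the fact that a hermitian form in y and its conjugate takes only real values. The following is a minimal Python sketch that checks this numerically; the 2×2 matrix `S` and the vector `y` are arbitrary illustrative values chosen here, not data from the paper, and `hermitian_form` is a hypothetical helper name.

```python
def hermitian_form(S, y):
    """Return sum over i, j of conj(y_i) * S[i][j] * y_j for a square matrix S."""
    n = len(y)
    return sum(y[i].conjugate() * S[i][j] * y[j]
               for i in range(n) for j in range(n))

# Illustrative hermitian matrix: S[i][j] equals the conjugate of S[j][i].
S = [[2.0 + 0.0j, 1.0 - 3.0j],
     [1.0 + 3.0j, -1.0 + 0.0j]]
y = [0.5 + 1.5j, -2.0 + 0.25j]

value = hermitian_form(S, y)
# The imaginary part vanishes because the off-diagonal terms are conjugates
# of one another and the diagonal entries of S are real.
```

Note that the form may be negative or zero even though it is real; positivity is a separate hypothesis, as in condition (iii).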

Once this result is obtained, the consideration of the existence of charac- 
teristic values and related expansion theorems for an H-definitely self-con- 
jugate adjoint system (2.1) is reducible to the same consideration for an 
associated H-definitely self-adjoint system with real coefficients. Since this 
reduction is attained by the same device of separating real and pure imagi- 
nary parts of (2.1) for real values of \ as used in Reid [11], the details of the 
reduction will be left to the reader. 

In a general discussion of boundary value problems one might very well 
start with a system of the form (2.1) whose coefficients are complex-valued, 
which satisfies the above conditions (i), (ii), (iv) and the following alternative 
to the above condition (iii): 

(iii)* If the linear vector space L be defined as in §2, with the understanding that the components of y and g are complex-valued, then there exist real constants $\alpha$ and $\beta$ not both zero and such that the functional

(12.3)  $\int_a^b \bar{y}\,T^*(\alpha L[y] + \beta By)\,dx,$




which is readily seen to be real on L, is positive for arbitrary vectors y of L such that $By \not\equiv 0$ on ab.

If a system (2.1) satisfies (i), (ii), (iii)* and (iv), and has $\alpha \neq 0$ in (12.3), this system may be reduced to an H-definitely self-conjugate adjoint system by a linear change of parameter and the possibly needed change of replacing T by −T. If for such a system we have $\alpha = 0$, then the system obtained is somewhat more general than a definitely self-conjugate adjoint system; for such
a problem, however, one is still able by the usual method of proof to establish 
the reality of characteristic values, the equality of index and multiplicity of 
its characteristic values, and a completeness property of the totality of char- 
acteristic solutions similar to that proved by Bliss for definitely self-adjoint 
systems (see Bliss [2, Theorem 2.3 and its Corollaries]). In a recent course 
on boundary value problems the author has followed this order of presenta- 
tion. For the purpose of publication of new results, however, the above sepa- 
rate treatment of H-definitely self-adjoint systems seems desirable, since by 
this procedure one is able on various occasions to utilize readily certain results 
that have previously been established by Bliss and the author. 


BIBLIOGRAPHY 


1. G. A. Bliss, A boundary value problem for a system of ordinary linear differential equations of the first order, these Transactions, vol. 28 (1926), pp. 561-589.

2. G. A. Bliss, Definitely self-adjoint boundary value problems, these Transactions, vol. 44 (1938), pp. 413-428.

3. E. Goursat, Cours d'Analyse Mathématique, vol. 3, Paris, 1927.

4. K. S. Hu, The problem of Bolza and its accessory boundary value problem, Contributions to the Calculus of Variations, 1931-1932, The University of Chicago Press, pp. 361-443.

5. E. L. Ince, Ordinary Differential Equations, 1927.

6. E. Kamke, Über die definiten selbstadjungierten Eigenwertaufgaben bei gewöhnlichen linearen Differentialgleichungen. I, Mathematische Zeitschrift, vol. 45 (1939), pp. 759-787; II, ibid., vol. 46 (1940), pp. 231-250; III, ibid., vol. 46 (1940), pp. 251-286.

7. M. Krein, Sur les opérateurs différentiels autoadjoints et leurs fonctions de Green symétriques, Recueil Mathématique, vol. 2 (44) (1937), pp. 1023-1070.

8. J. Mercer, Functions of positive and negative type, and the connection with the theory of integral equations, Philosophical Transactions of the Royal Society, vol. 209 A (1909), pp. 415-446.

9. W. T. Reid, A boundary value problem associated with the calculus of variations, American Journal of Mathematics, vol. 54 (1932), pp. 769-790.

10. W. T. Reid, A system of ordinary linear differential equations with two-point boundary conditions, these Transactions, vol. 44 (1938), pp. 508-521.

11. W. T. Reid, Some remarks on linear differential systems, Bulletin of the American Mathematical Society, vol. 45 (1939), pp. 414-419.


UNIVERSITY OF CHICAGO,
CHICAGO, ILL.


BOUNDED UNIVALENT FUNCTIONS 


BY 
RAPHAEL M. ROBINSON 


1. Introduction. We shall consider the class of functions f(z) which are regular and univalent for |z| < 1, with |f(z)| < 1 there, and with f(0) = 0. For any fixed $z_0 \neq 0$ in the unit circle, we use the abbreviations


(1)  $a = |f'(0)|, \quad b = |z_0|, \quad c = |f(z_0)|, \quad d = |f'(z_0)|.$


We are concerned in this paper with the inequalities relating a, b, c, d. It may 
be noted that these relations are not affected if we impose the condition 
f'(0) >0, which we shall do. 

The four quantities (1) are restricted individually only by


(2)  $0 < a \le 1, \quad 0 < b < 1, \quad 0 < c < 1, \quad 0 < d.$


If a = 1, then f(z) = z, hence d = 1 and c = b. This trivial case will be excluded below where convenient. If a has any other given value, then it is easily seen that no restriction is placed on any one of the other quantities. Between b and c the only relation is $c \le b$; the equality c = b holds only if a = 1. The relations among the quantities of each of the sets (b, d), (c, d), (a, b, c), (b, c, d), and (a, b, c, d) are considered in §5. The relations between (a, b, d) are considered in §6, and those between (a, c, d) in §7. Thus all subsets of the four quantities are considered. It should be pointed out that the determination of the inequalities satisfied by the four quantities by no means completes the solution, since one of the main difficulties is that of eliminating one of the quantities in order to find the inequalities satisfied by three of them.

All of the inequalities which we obtain will be sharp; that is, in each case 
there is an extremal function for which the inequality becomes an equality. 
But in general we shall not go beyond the mere existence of such an extremal 
function. 

Finally, there is an appendix (§8) on unbounded univalent functions. Let F(z) be regular and univalent for |z| < 1, and suppose F(0) = 0, F'(0) = 1. The relations between $|z_0|$, $|F(z_0)|$, and $|F'(z_0)|$ are discussed in detail. One result, which may seem surprising, will be mentioned here: If $|F(z_0)| \le 1/4$ then


$|F'(z_0)| \le 1 + 3\cdot 2^{-3/2} = 2.06\cdots,$


but if $|F(z_0)|$ has a prescribed value greater than 1/4, no upper bound for $|F'(z_0)|$ can be given. The results in the appendix are obtained as limiting cases of results for bounded univalent functions, but without using §6 and §7.


Presented to the Society, November 22, 1941; received by the editors November 13, 1941. 


2. The method of Löwner. If, from our class of bounded univalent functions, a subclass is chosen by means of which any function of the given class can be uniformly approximated in the interior of the unit circle, then the inequalities between any of the quantities a, b, c, d are the same for the subclass as for the whole class, except perhaps as regards the possibility of equality signs holding.

According to Löwner(1), we may choose the subclass as the class of functions f(z) to which a function f(z, t) can be found, with the following properties. There is a number I > 0 such that f(z, t) is continuous for |z| < 1 and $0 \le t \le I$, and is a regular function of z for each fixed t. The boundary conditions


(3)  $f(z, 0) = z, \qquad f(z, I) = f(z)$


are satisfied, so that f(z) may be regarded as having been obtained from the 
identity by continuous variation. The rate at which this variation takes place 
is governed by 


(4)  $f'(0, t) = e^{-t},$


where the prime denotes differentiation with respect to z. Finally, there is a continuous function $\kappa(t)$ with $|\kappa(t)| = 1$, such that f(z, t) satisfies the differential equation


(5)  $\dfrac{\partial f(z,t)}{\partial t} = -f(z,t)\,\dfrac{1+\kappa(t)f(z,t)}{1-\kappa(t)f(z,t)}.$

It is also permissible, and more convenient for us, to allow $\kappa(t)$ to be a piece-wise continuous function; we then understand that (5) is to hold except at the points of discontinuity of $\kappa(t)$, and similarly below. This weaker condition on $\kappa(t)$ means that we are choosing a larger subclass from the class of bounded univalent functions. The advantage of this is that extremal functions for all of our inequalities are then brought within the subclass.

Another fact which is important for us is that (5) can be solved for any given $\kappa(t)$ satisfying the conditions mentioned, so that there is a one-to-one correspondence between such functions $\kappa(t)$ and functions f(z) of our subclass.
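As an illustration of solving the differential equation for a given $\kappa(t)$, here is a minimal Python sketch. It assumes the equation has the form $\partial f/\partial t = -f\,(1+\kappa f)/(1-\kappa f)$ with $f(z,0)=z$, and integrates it with a classical Runge-Kutta step. The step count, the sample point `z0`, and all function names are choices made for this sketch. For the constant choice $\kappa \equiv -1$ one can check by differentiation that $f/(1-f)^2 = e^{-t}\,z/(1-z)^2$, which gives an independent test of the integrator.

```python
import math

def loewner_rhs(f, kappa):
    """Right-hand side of the assumed equation: -f (1 + kappa f) / (1 - kappa f)."""
    return -f * (1 + kappa * f) / (1 - kappa * f)

def integrate_loewner(z, total_time, kappa_of_t, steps=2000):
    """Integrate from f(z, 0) = z up to t = total_time with classical RK4."""
    h = total_time / steps
    f = complex(z)
    t = 0.0
    for _ in range(steps):
        k1 = loewner_rhs(f, kappa_of_t(t))
        k2 = loewner_rhs(f + 0.5 * h * k1, kappa_of_t(t + 0.5 * h))
        k3 = loewner_rhs(f + 0.5 * h * k2, kappa_of_t(t + 0.5 * h))
        k4 = loewner_rhs(f + h * k3, kappa_of_t(t + h))
        f += (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return f

def koebe(w):
    """The function w / (1 - w)^2."""
    return w / (1 - w) ** 2

z0 = 0.3 + 0.4j          # sample point in the unit circle (illustrative)
I = 1.0                  # total "time" of the variation (illustrative)
f_end = integrate_loewner(z0, I, lambda t: -1.0)
closed_form_gap = abs(koebe(f_end) - math.exp(-I) * koebe(z0))
```

A piece-wise continuous $\kappa(t)$ can be passed in the same way, provided the step boundaries are aligned with its points of discontinuity.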

From (3) and (4) we see that


(6)  $a = e^{-I}.$

3. The integrals I and J. From (5) we readily obtain


(7)  $\dfrac{d}{dt}\,|f(z_0,t)| = -|f(z_0,t)|\,\dfrac{1-|f(z_0,t)|^2}{|1-\kappa(t)f(z_0,t)|^2}.$


(1) K. Löwner, Untersuchungen über schlichte konforme Abbildungen des Einheitskreises, Mathematische Annalen, vol. 89 (1923), pp. 103-121.


If we put

(8)  $s = |f(z_0, t)|,$


then s decreases from b to c while t increases from 0 to I. Any function of t in the interval $0 \le t \le I$ may be regarded as a function of s in the interval $c \le s \le b$. In particular, we put


(9)  $\kappa(t) f(z_0, t) = \eta(s)s,$


so that $\eta(s)$ is a piece-wise continuous function with $|\eta(s)| = 1$. Then (7) takes the form


(10)  $\dfrac{ds}{dt} = -\dfrac{s}{H(s)},$

where

(11)  $H(s) = \dfrac{|1-\eta(s)s|^2}{1-s^2}.$

We have evidently

(12)  $\dfrac{1-s}{1+s} \le H(s) \le \dfrac{1+s}{1-s}.$


We show now that $\eta(s)$ is an arbitrary piece-wise continuous function with $|\eta(s)| = 1$. If any such $\eta(s)$ is given in the interval $c \le s \le b$, then we first determine H(s) from (11), and then from (10) and the fact that t = 0 for s = b, we find that


(13)  $t = \int_s^b H(s)\,ds/s,$

and in particular that

(14)  $I = \int_c^b H(s)\,ds/s.$
Now (13) determines s as a function of t in the interval $0 \le t \le I$, so that (8) and (9) determine $|f(z_0, t)|$ and $\kappa(t)f(z_0, t)$. From (5) it follows that

(15)  $\dfrac{d}{dt}\,\mathrm{amp}\,f(z_0,t) = -\dfrac{2\,\Im[\kappa(t)f(z_0,t)]}{|1-\kappa(t)f(z_0,t)|^2}.$

Using the value just found for $\kappa(t)f(z_0, t)$, we can find amp $f(z_0, t)$, and hence $f(z_0, t)$ itself, the initial condition $f(z_0, 0) = z_0$ being imposed. Finally, knowing $\kappa(t)f(z_0, t)$ and $f(z_0, t)$, we can find $\kappa(t)$ by division; it will be piece-wise continuous and satisfy $|\kappa(t)| = 1$. Now the $\kappa(t)$ and $f(z_0, t)$ determined satisfy (7) and (15), which are equivalent to (5) with $z = z_0$. From this we see that the $\kappa(t)$ which we have found will in fact lead back to the desired $\eta(s)$.

From the fact that $\eta(s)$ is arbitrary, we see that H(s) is an arbitrary piece-wise continuous function satisfying (12). For if any such H(s) is given, we can find an $\eta(s)$ satisfying (11).

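It is readily seen that if f(z, t) is a continuous function which is regular in z for each fixed t, and such that $f_t(z, t)$ is continuous, then $f_{zt}(z, t)$ and $f_{tz}(z, t)$ exist and are equal. Using this fact, from (5) we find that


(16)  $\dfrac{\partial}{\partial t}\log|f'(z_0,t)| = 1 - \dfrac{2\,\Re\{[1-\bar\kappa(t)\bar f(z_0,t)]^2\}}{|1-\kappa(t)f(z_0,t)|^4}.$

Expressing the right side in terms of s, and using (10), we have

$\dfrac{d}{ds}\log|f'(z_0,t)| = -\dfrac{|1-\eta(s)s|^2 - 2\cos 2\,\mathrm{amp}\,[1-\eta(s)s]}{s(1-s^2)}.$

Integrating from t = 0 to t = I gives


(17)  $\log d = \int_c^b \dfrac{|1-\eta(s)s|^2 - 2\cos 2\,\mathrm{amp}\,[1-\eta(s)s]}{s(1-s^2)}\,ds.$


Now

$\cos \mathrm{amp}\,[1-\eta(s)s] = \dfrac{1-s^2+|1-\eta(s)s|^2}{2\,|1-\eta(s)s|},$


so that the numerator of the integrand becomes $2s^2 - (1-s^2)/H(s)$, and (17) takes the form


(18)  $\log d = \log\dfrac{1-c^2}{1-b^2} - J,$


where


(19)  $J = \int_c^b (1/H(s))\,ds/s.$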


The problem of finding what values are possible for a and d when b and c are given is thus reduced to finding the relations between the integrals I and J. The corresponding values of a and d are then found from (6) and (18).
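To make the reduction concrete, the Python sketch below computes I and J by numerical quadrature for a given H(s) and converts them to a and d. It assumes the relations $a = e^{-I}$ and $d = \frac{1-c^2}{1-b^2}e^{-J}$ for (6) and (18); the Simpson rule, the sample values of b and c, and the function names are choices of this sketch. For $H(s) = (1+s)/(1-s)$ both integrals are elementary, which gives a closed form to compare against.

```python
import math

def simpson(func, lo, hi, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3

def a_and_d(H, b, c):
    """Return (a, d), assuming a = exp(-I) and d = (1-c^2)/(1-b^2) * exp(-J)."""
    I = simpson(lambda s: H(s) / s, c, b)          # integral of H(s) ds/s
    J = simpson(lambda s: 1.0 / (H(s) * s), c, b)  # integral of (1/H(s)) ds/s
    a = math.exp(-I)
    d = (1 - c * c) / (1 - b * b) * math.exp(-J)
    return a, d

b, c = 0.6, 0.2
a, d = a_and_d(lambda s: (1 + s) / (1 - s), b, c)

# Closed forms for this particular H, obtained by elementary integration.
a_exact = c * (1 - b) ** 2 / (b * (1 - c) ** 2)
d_exact = c * (1 - c) * (1 + b) / (b * (1 + c) * (1 - b))
```

Any other admissible H(s), for instance a constant on part of the interval, can be passed to `a_and_d` in the same way.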

4. Relations between I and J. Suppose 0 < c < b < 1, and let I and J be defined by (14) and (19), where H(s) is any function satisfying (12) and piece-wise continuous in the interval $c \le s \le b$. In this section, we shall determine the inequalities relating I and J. It is clear that the relation between I and J is a symmetric one.

We start by introducing two functions which we shall need in this discussion. For $c \le r \le b$, let


(20)  $p(r; b, c) = \int_c^r \dfrac{1-s}{1+s}\,\dfrac{ds}{s} + \int_r^b \dfrac{1-r}{1+r}\,\dfrac{ds}{s},$

      $q(r; b, c) = \int_c^r \dfrac{1+s}{1-s}\,\dfrac{ds}{s} + \int_r^b \dfrac{1+r}{1-r}\,\dfrac{ds}{s}.$

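Evaluating the integrals gives


(21)  $p(r; b, c) = \log\left[\dfrac{r}{(1+r)^2} \div \dfrac{c}{(1+c)^2}\right] + \dfrac{1-r}{1+r}\log\dfrac{b}{r},$

      $q(r; b, c) = \log\left[\dfrac{r}{(1-r)^2} \div \dfrac{c}{(1-c)^2}\right] + \dfrac{1+r}{1-r}\log\dfrac{b}{r};$

in particular we have

(22)  $p(b; b, c) = \log\left[\dfrac{b}{(1+b)^2} \div \dfrac{c}{(1+c)^2}\right], \qquad p(c; b, c) = \dfrac{1-c}{1+c}\log\dfrac{b}{c},$

      $q(b; b, c) = \log\left[\dfrac{b}{(1-b)^2} \div \dfrac{c}{(1-c)^2}\right], \qquad q(c; b, c) = \dfrac{1+c}{1-c}\log\dfrac{b}{c}.$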

It is clear from (20) that p(r; b, c) is a decreasing function of r, and q(r; b, c) an increasing function. Since p(r; b, c) < q(r; b, c), we have in particular


(23)  $p(b; b, c) < p(c; b, c) < q(c; b, c) < q(b; b, c).$
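The closed forms and the ordering (23) are easy to check numerically. The Python sketch below assumes that p and q are given by $\log[\,r(1\pm c)^2 / (c(1\pm r)^2)\,] + \frac{1\mp r}{1\pm r}\log(b/r)$, with the upper signs for p and the lower signs for q; the sample values of b and c are arbitrary.

```python
import math

def p(r, b, c):
    """p(r; b, c) in closed form, valid for c <= r <= b."""
    return (math.log(r * (1 + c) ** 2 / (c * (1 + r) ** 2))
            + (1 - r) / (1 + r) * math.log(b / r))

def q(r, b, c):
    """q(r; b, c) in closed form, valid for c <= r <= b."""
    return (math.log(r * (1 - c) ** 2 / (c * (1 - r) ** 2))
            + (1 + r) / (1 - r) * math.log(b / r))

b, c = 0.6, 0.2
chain = [p(b, b, c), p(c, b, c), q(c, b, c), q(b, b, c)]

# Sample p and q on a grid to illustrate the monotonicity in r.
grid = [c + (b - c) * k / 50 for k in range(51)]
p_values = [p(r, b, c) for r in grid]
q_values = [q(r, b, c) for r in grid]
```

The four entries of `chain` should appear in strictly increasing order, `p_values` should decrease and `q_values` should increase along the grid.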


It is clear first of all that individually I and J are restricted only by the conditions


(24)  $p(b; b, c) \le I \le q(b; b, c),$
(25)  $p(b; b, c) \le J \le q(b; b, c).$


To find the largest possible value of J when I is given, we note that $H(s) + 1/H(s) \le (1-s)/(1+s) + (1+s)/(1-s)$, and hence


(26)  $I + J \le p(b; b, c) + q(b; b, c).$


The equality is attained for any piece-wise continuous function H(s) which is equal to (1−s)/(1+s) in some subintervals, and to (1+s)/(1−s) in others. Since these are the values of H(s) which give I its smallest and largest values, we see that any possible value of I can be obtained in this way. Hence for any given I, the largest possible J is determined from (26).

It remains to find the smallest possible J for a given I. Let k be any positive constant, and consider


The integrand is smallest when H(s) = k; but this may not be compatible with (12). It is clear that the integral will be minimized if we keep H(s) as near to k as possible. If k is sufficiently small, then this H(s) is always equal to (1−s)/(1+s); and if k is sufficiently large, then H(s) = (1+s)/(1−s). Hence for a suitable value of k, I may be given any possible value. Thus the minimum J for any given I is obtained for H(s) as near to some constant k as possible.

To formulate the lower bound for J, we distinguish three cases:


(28)  Case p.  $p(b; b, c) \le I \le p(c; b, c).$
      Case 0.  $p(c; b, c) \le I \le q(c; b, c).$
      Case q.  $q(c; b, c) \le I \le q(b; b, c).$


We note first that p(r; b, c) is given by (14) where H(s) is as near to (1−r)/(1+r) as possible. Hence in Case p, we determine r so that p(r; b, c) = I, and then $J \ge q(r; b, c)$. Similarly, in Case q, $J \ge p(r; b, c)$, where q(r; b, c) = I. In Case 0, H(s) may be equal to a constant k throughout the interval, and hence J is minimized in this way. Hence for I = k log b/c, the minimum J is (1/k) log b/c, so that $J \ge (\log b/c)^2/I$. The three cases may be combined in the form


(29)  $J \ge L(I; b, c),$

where

(30)  $L(I; b, c) = \begin{cases} q(r; b, c), \text{ where } p(r; b, c) = I, & \text{in Case } p,\\ (\log b/c)^2/I, & \text{in Case } 0,\\ p(r; b, c), \text{ where } q(r; b, c) = I, & \text{in Case } q.\end{cases}$
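The three-case lower bound can be evaluated numerically. The Python sketch below is one way to do it, assuming the closed forms of p and q used in (21) and solving p(r; b, c) = I or q(r; b, c) = I for r by bisection (p is decreasing in r and q is increasing). The names `solve_r` and `L_bound` and the sample values are assumptions of this sketch.

```python
import math

def p(r, b, c):
    return (math.log(r * (1 + c) ** 2 / (c * (1 + r) ** 2))
            + (1 - r) / (1 + r) * math.log(b / r))

def q(r, b, c):
    return (math.log(r * (1 - c) ** 2 / (c * (1 - r) ** 2))
            + (1 + r) / (1 - r) * math.log(b / r))

def solve_r(func, target, b, c, increasing):
    """Bisection for func(r, b, c) = target on the interval c <= r <= b."""
    lo, hi = c, b
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if (func(mid, b, c) < target) == increasing:
            lo = mid      # the root lies to the right of mid
        else:
            hi = mid      # the root lies to the left of mid
    return 0.5 * (lo + hi)

def L_bound(I, b, c):
    """Smallest possible J for a given I, following the three cases."""
    if I < p(b, b, c) or I > q(b, b, c):
        raise ValueError("I lies outside the admissible range")
    if I <= p(c, b, c):                                   # Case p
        return q(solve_r(p, I, b, c, increasing=False), b, c)
    if I <= q(c, b, c):                                   # Case 0
        return math.log(b / c) ** 2 / I
    return p(solve_r(q, I, b, c, increasing=True), b, c)  # Case q

b, c = 0.6, 0.2
I_sample = 0.6                       # lies in Case p for these b, c
J_min = L_bound(I_sample, b, c)
round_trip = L_bound(J_min, b, c)    # symmetry: should return I_sample
```

The round trip reflects the symmetry of the relation between I and J: the pair (I, L(I)) and the pair (L(I), I) lie on the same boundary curve.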


5. The simpler cases of the problem. In this section we obtain the inequal- 
ities among each set of quantities chosen from (a, b, c, d), except for the trivial 
cases treated in the introduction, and the cases (a, b, d) and (a, c, d), which 
have separate sections devoted to them. However, partial results for those 
two cases are given here. 

Relations between a, b, c. These were first obtained by Pick(2); more recently, Golusin(3) derived them, using the method of Löwner. From (24) we have, for c < b,


(31)  $\dfrac{c}{(1-c)^2} \div \dfrac{b}{(1-b)^2} \le a \le \dfrac{c}{(1+c)^2} \div \dfrac{b}{(1+b)^2}.$

(2) G. Pick, Über die konforme Abbildung eines Kreises auf ein schlichtes und zugleich beschränktes Gebiet, Sitzungsberichte Akademie der Wissenschaften, Vienna, vol. 126 (1917), pp. 247-263.
(3) G. M. Golusin, Über die Verzerrungssätze der schlichten konformen Abbildungen (Russian with German summary), Matematicheskii Sbornik (Recueil Mathématique), vol. 43 (1936), pp. 127-135.


For c = b, (31) gives a = 1, which is correct. Thus for any given b and c, with $c \le b$, the value of a is restricted only by (31). It may be noted that (31) itself implies that $c \le b$, so that it is in fact the only relation between a, b, c.

The bounds in (31) are attained for the function w = f(z) defined for |z| < 1 by


(32)  $\dfrac{w}{(1-w)^2} = \dfrac{az}{(1-z)^2}, \qquad |w| < 1.$


This function maps |z| < 1 on |w| < 1 with a slit along the negative real axis. The lower and upper bounds are attained for $z_0 > 0$ and $z_0 < 0$, respectively.

Relations between b, c, d. From (25) we have

(33)  $\dfrac{c}{(1-c)^2} \div \dfrac{b}{(1-b)^2} \le \dfrac{1-b^2}{1-c^2}\,d \le \dfrac{c}{(1+c)^2} \div \dfrac{b}{(1+b)^2}.$

This inequality could also be obtained from (31) by making linear transformations of |z| < 1 and |w| < 1 into themselves, in such a way that 0 and $z_0$ are interchanged in the z-plane, and 0 and $f(z_0)$ in the w-plane. From (33) we find that


(34)  $\dfrac{c(1+c)}{1-c} \div \dfrac{b(1+b)}{1-b} \le d \le \dfrac{c(1-c)}{1+c} \div \dfrac{b(1-b)}{1+b}.$


This is the only relation between b, c, d. The bounds are attained for the function (32), for $z_0 < 0$ and $z_0 > 0$, respectively.
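The extremal slit mapping can be evaluated explicitly, which allows a numerical check that the bounds are attained. The Python sketch below assumes the mapping is defined by $w/(1-w)^2 = a\,z/(1-z)^2$ with |w| < 1, inverts the Koebe function with the principal square root, and obtains the derivative by implicit differentiation. The bounds it compares against are the ones described in the text, written as $c/(1\mp c)^2 \div b/(1\mp b)^2$ for a and $c(1\pm c)/(1\mp c) \div b(1\pm b)/(1\mp b)$ for d; the values of a and b and all function names are choices of this sketch.

```python
import cmath

def koebe(z):
    """The function z / (1 - z)^2."""
    return z / (1 - z) ** 2

def slit_map(z, a):
    """Solve w/(1-w)^2 = a z/(1-z)^2 for the root with |w| < 1."""
    K = a * koebe(z)
    root = cmath.sqrt(1 + 4 * K)   # principal branch equals (1+w)/(1-w)
    return 4 * K / (1 + root) ** 2

def slit_map_derivative(z, a):
    """dw/dz from implicit differentiation of the defining relation."""
    w = slit_map(z, a)
    return a * (1 + z) * (1 - w) ** 3 / ((1 - z) ** 3 * (1 + w))

a, b = 0.5, 0.6

# z_0 = b > 0: lower bound for a, upper bound for d.
c_pos = abs(slit_map(b, a))
d_pos = abs(slit_map_derivative(b, a))
a_lower = c_pos / (1 - c_pos) ** 2 / (b / (1 - b) ** 2)
d_upper = c_pos * (1 - c_pos) / (1 + c_pos) / (b * (1 - b) / (1 + b))

# z_0 = -b < 0: upper bound for a, lower bound for d.
c_neg = abs(slit_map(-b, a))
d_neg = abs(slit_map_derivative(-b, a))
a_upper = c_neg / (1 + c_neg) ** 2 / (b / (1 + b) ** 2)
d_lower = c_neg * (1 + c_neg) / (1 - c_neg) / (b * (1 + b) / (1 - b))
```

In both cases the bound for a reproduces the prescribed value of a, and the corresponding bound for d coincides with the computed derivative.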

Relations between b and d. This case may be solved by seeing what bounds are given for d by (34) when b is given but c is not. The only restriction on c is that $0 < c \le b$. Letting $c \to 0$, we see that there is no positive lower bound for d. On the other hand, the right side of (34) is bounded, and in fact has its largest value for $c = 2^{1/2} - 1$; if this is not within the allowed interval, then the largest possible value is at c = b. From this we find that


(35)  $d \le \begin{cases} 1 & \text{if } 0 < b \le 2^{1/2} - 1,\\[4pt] (3 - 2^{3/2})\,\dfrac{1+b}{b(1-b)} & \text{if } 2^{1/2} - 1 \le b < 1.\end{cases}$

This is the only relation between b and d. The equality sign holds for the identity in the first case, and for the function (32) with a chosen so that $c = 2^{1/2} - 1$ in the second. Dieudonné(4) has shown that the first part of (35) holds for bounded functions which are not supposed univalent.
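The maximization over c can be checked by brute force. The Python sketch below assumes that the right side of (34) is $\frac{c(1-c)}{1+c}\cdot\frac{1+b}{b(1-b)}$ and compares a grid search over $0 < c \le b$ with the two-case closed form, in which the second case has the constant $3 - 2^{3/2}$. The grid size and function names are choices of this sketch.

```python
import math

def upper_bound_34(b, c):
    """Assumed right side of (34): c(1-c)/(1+c) * (1+b)/(b(1-b))."""
    return c * (1 - c) / (1 + c) * (1 + b) / (b * (1 - b))

def max_d_closed_form(b):
    """Two-case bound: 1 for small b, otherwise the value at c = sqrt(2) - 1."""
    if b <= math.sqrt(2) - 1:
        return 1.0
    return (3 - 2 ** 1.5) * (1 + b) / (b * (1 - b))

def max_d_by_scan(b, n=200000):
    """Brute-force maximum of the right side of (34) over 0 < c <= b."""
    return max(upper_bound_34(b, b * k / n) for k in range(1, n + 1))

gap_small_b = abs(max_d_by_scan(0.3) - max_d_closed_form(0.3))
gap_large_b = abs(max_d_by_scan(0.7) - max_d_closed_form(0.7))
```

Both gaps should be at the level of rounding or grid error, and the two cases of the closed form agree at $b = 2^{1/2} - 1$, where each gives the value 1.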

Relations between c and d. Letting $b \to 1$ in (34), we see that there is no restriction on the value of d if only c is given.


(4) J. Dieudonné, Recherches sur quelques problèmes relatifs aux polynômes et aux fonctions bornées d'une variable complexe, Annales de l'École Normale, vol. (3) 48 (1931), p. 352.


Relations between a, b, d (partial results). From (31) and (34) it is possible to find the smallest value of d for given a and b, and in some cases also the largest d.

If a and b are given, we may determine the smallest possible value of c from the right side of (31), and then the smallest possible d for this c and the given b from the left side of (34). Since the two functions of c involved in the bounds are increasing, and since the equality signs in the two cases are attained together, we obtain in this way the best lower bound for d in terms of a and b:


(36)  $d \ge \dfrac{c(1+c)}{1-c} \div \dfrac{b(1+b)}{1-b},$


where c is determined from

(37)  $\dfrac{c}{(1+c)^2} = \dfrac{ab}{(1+b)^2}.$


The equality in (36) is attained for (32) with $z_0 < 0$.

Similarly, the equality signs on the left side of (31) and the right side of 
(34) are attained together. But the function c(1—c)/(1+c), which occurs on 
the right side of (34), is increasing only for c$2'/?—1, so that we can draw. 
the conclusion 


(37) 


c(1—c) — d) 


38 d 
(38) 1+5 


’ 


where c is determined from 
ab 


(39) 


only if that value of c is not greater than 2/?—1. In this case, the equality 
will be attained for (32) with 29>0. It is clear that we obtain in this way the 
best upper bound for d in terms of a and b if 6 =2"/?—1. We shall show in §6 
that the same is true whenever 31/2 and in some other cases, but not in 
all cases. 
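The two-step recipe above (solve for c, then substitute) is easy to mechanize. The sketch below is illustrative only: the function names are hypothetical, and (36) to (39) are used in the forms $c/(1\pm c)^2 = ab/(1\pm b)^2$ and $d \gtrless c(1\pm c)(1\mp b)/(b(1\mp c)(1\pm b))$, which is an assumed reading of the displays. Each quadratic is solved for its root in (0, 1).

```python
import math


def c_min(a, b):
    """Root in (0, 1) of c/(1+c)^2 = ab/(1+b)^2, cf. (37)."""
    k = a * b / (1.0 + b) ** 2
    return ((1.0 - 2.0 * k) - math.sqrt(1.0 - 4.0 * k)) / (2.0 * k)


def c_max(a, b):
    """Root in (0, 1) of c/(1-c)^2 = ab/(1-b)^2, cf. (39)."""
    k = a * b / (1.0 - b) ** 2
    return ((1.0 + 2.0 * k) - math.sqrt(1.0 + 4.0 * k)) / (2.0 * k)


def lower_36(a, b):
    """Lower bound (36) for d, with c taken from (37)."""
    c = c_min(a, b)
    return c * (1.0 + c) * (1.0 - b) / (b * (1.0 - c) * (1.0 + b))


def upper_38(a, b):
    """Upper bound (38) for d, or None when c from (39) exceeds sqrt(2)-1,
    in which case this elementary argument gives no conclusion."""
    c = c_max(a, b)
    if c > math.sqrt(2.0) - 1.0:
        return None
    return c * (1.0 - c) * (1.0 + b) / (b * (1.0 + c) * (1.0 - b))
```

With a = 1 (the identity map) both roots reduce to c = b and both bounds reduce to d = 1, which is a convenient consistency check.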

Relations between a, c, d (partial results). We try to find bounds for d in terms of a and c from (31) and (34). If

\[ a > \frac{4c}{(1+c)^2}, \tag{40} \]

then the right side of (31) determines a largest value possible for b, and then with this b and the given c, a lower bound for d is determined from the left side of (34). This bound is the best possible. It is given by (36) with b determined from (37), and is attained for (32) with $z_0<0$. On the other hand, if


434 R. M. ROBINSON ~ [November 


(40) is not satisfied, then b is permitted values arbitrarily close to 1. It is easily shown that no positive lower bound exists for d in this case. For example, it is sufficient to consider the functions $w=f(z)$ which map $|z|<1$ on $|w|<1$ with a slit from $-1$ nearly to $-c$ and a slit on the positive real axis long enough to give a the proper value, $z_0$ being chosen so that $f(z_0)=-c$.

We can draw the conclusion that an upper bound for d is given by (38) with b determined from (39) only if $b(1-b)/(1+b)$ has its smallest value when b has its smallest possible value; if this is true, then the maximum value of d is attained for (32) with $z_0>0$. The condition is certainly satisfied if for the given values of a and c one has necessarily $b\le 2^{1/2}-1$; this is true at least for a near 1 and c near 0. More generally, if we denote the smallest possible value of b by $b_{\min}$, and suppose that b has a largest possible value $b_{\max}$, then the condition is satisfied if $b(1-b)/(1+b)$ is not larger at $b_{\min}$ than at $b_{\max}$. This condition reduces to

\[ (1+b_{\min})(1+b_{\max}) \le 2, \]

which is the best result obtainable by the present method. We shall show in §7 that the conclusion holds if and only if

\[ b_{\min}\le 1/2, \qquad b_{\max}\le 1, \tag{41} \]

where the second inequality is to be interpreted to mean: either a and c have such values that $b_{\max}<1$, or are limits of such values.
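The reduction of the comparison of $b(1-b)/(1+b)$ at the two endpoints to the product condition $(1+b_{\min})(1+b_{\max})\le 2$ follows from the identity $g(b_2)-g(b_1) = (b_2-b_1)\,(2-(1+b_1)(1+b_2))/((1+b_1)(1+b_2))$ with $g(b)=b(1-b)/(1+b)$. A small brute-force check, with illustrative function names that do not come from the paper, is sketched here.

```python
def g(b):
    """The function b(1-b)/(1+b) whose endpoint values are compared."""
    return b * (1.0 - b) / (1.0 + b)


def condition_direct(b_min, b_max):
    """g is not larger at b_min than at b_max."""
    return g(b_min) <= g(b_max)


def condition_reduced(b_min, b_max):
    """The reduced form (1 + b_min)(1 + b_max) <= 2."""
    return (1.0 + b_min) * (1.0 + b_max) <= 2.0


def count_disagreements(n=97):
    """Compare both conditions on a grid of pairs 0 < b_min < b_max < 1."""
    bad = 0
    for i in range(1, n):
        for j in range(i + 1, n):
            b_min, b_max = i / n, j / n
            # Skip numerical ties, where rounding could flip either test.
            if abs((1.0 + b_min) * (1.0 + b_max) - 2.0) < 1e-9:
                continue
            if condition_direct(b_min, b_max) != condition_reduced(b_min, b_max):
                bad += 1
    return bad
```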

Remark on the hyperbolic expansion factor. We may interpret the expression $d(1-b^2)/(1-c^2)$, which occurs in (33), as the expansion factor for the mapping $w=f(z)$, when the metric of hyperbolic geometry is introduced in $|z|<1$ and $|w|<1$. By a similar argument to that used above, we find that no matter which two of the three quantities a, b, c are given, the hyperbolic expansion is minimized for (32) with $z_0<0$, and maximized for (32) with $z_0>0$. Only in case a and c are given, not satisfying (40), and we are seeking to minimize the hyperbolic expansion, is it impossible to satisfy the necessary conditions, $z_0<0$ and $|f(z_0)|=c$. But in this case we know that d has no positive lower bound, and a fortiori the same is true of the hyperbolic expansion. It may also be noted that the conclusion that the hyperbolic expansion has its extreme values for (32) is weaker than the same conclusion about d, and hence follows from this when this is true. The bounds for the hyperbolic expansion when a and b are given were found by Pick(*).

So far in this section, we have used from §4 only the trivial results (24) and (25); the rest of the section is used first in considering the relation between all four quantities. It may also be noted that the results of this section so far have depended only on (31) and (33). Since (33) can be deduced from (31), these results can be obtained without using Löwner's method, if we assume (31) from the work of Pick. The rest of this section, and the next two sections, depend essentially on Löwner's method.

(*) Pick, loc. cit.

Relations between a, b, c, d. On account of (18), we see that the lower bound for d in terms of a, b, c is found from the upper bound for J, given in (26). This leads to

\[ d \ge \frac{c^2}{ab^2}\cdot\frac{1-b^2}{1-c^2}. \tag{42} \]

It may be verified that this bound is attained for any function $w=f(z)$ mapping $|z|<1$ on $|w|<1$ with slits along the positive and negative real axes; the equality sign in (42) then holds for any positive or negative $z_0$. The lengths of the slits and the value of $z_0$ may be so chosen as to give any desired values to a, b, c.

Similarly, the upper bound for d is found from the lower bound for J, given by (29). The result may be written in the form

\[ \log d \le M(I; b, c), \tag{43} \]

where

\[ M(I; b, c) = \log\frac{1-c^2}{1-b^2} - L(I; b, c). \tag{44} \]


6. Relations between a, b, d. We shall obtain the lower and upper bounds 
for d in terms of a and b by eliminating c from (42) and from (43); this will 
complete the partial solution given in §5. 

The lower bound for d in terms of a and b may be obtained from (42) by 
substituting the smallest possible value of c, which is obtained from (37). 
This is seen to agree with our previous result, which was (36) with the same 
value of c substituted. 
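The agreement just asserted can be confirmed by direct substitution. In the sketch below (not from the paper; the names are illustrative) (42) is taken as $d\ge c^2(1-b^2)/(ab^2(1-c^2))$ and (36) as $d\ge c(1+c)(1-b)/(b(1-c)(1+b))$, both assumed readings of the garbled displays, and a is eliminated through (37).

```python
def bound_42(a, b, c):
    """Lower bound (42), as read here: c^2 (1-b^2) / (a b^2 (1-c^2))."""
    return c * c * (1.0 - b * b) / (a * b * b * (1.0 - c * c))


def bound_36(b, c):
    """Lower bound (36), as read here: c(1+c)(1-b) / (b(1-c)(1+b))."""
    return c * (1.0 + c) * (1.0 - b) / (b * (1.0 - c) * (1.0 + b))


def a_from_37(b, c):
    """Value of a for which c is the smallest admissible value, from (37)."""
    return c * (1.0 + b) ** 2 / (b * (1.0 + c) ** 2)
```

Substituting `a_from_37(b, c)` into `bound_42` cancels one factor of c and one of b and leaves exactly `bound_36(b, c)`.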

We turn now to the problem of finding the upper bound for d in terms of a and b. We have to maximize $M(I; b, c)$ for all possible values of c. Now (44) defines $M(I; b, c)$ in terms of $L(I; b, c)$, which in turn is defined by (30). In (30), different formulas hold in each of the three cases (28). The cases are distinguished according to the interval in which I lies when b and c are given. But now we wish to consider I and b as given, and see in what intervals c must lie in order that each of the cases may hold.

We note first that all four functions $p(b; b, c)$, $p(c; b, c)$, $q(c; b, c)$, $q(b; b, c)$ are decreasing functions of c. This is evident from (22) for all the functions but $q(c; b, c)$. For fixed b let $\phi(c)=q(c; b, c)$; then

\[ \phi(c) = \frac{1+c}{1-c}\log\frac{b}{c}, \qquad \phi'(c) = \frac{2}{(1-c)^2}\log\frac{b}{c} - \frac{1+c}{c(1-c)}. \]

If we put $\psi(c)=\log c+(1-c^2)/2c$, then the condition $\phi'(c)<0$ reduces to the form $\psi(c)>\log b$. Now $\psi'(c)=-(1-c)^2/2c^2<0$, so that $\psi(c)$ is decreasing. Hence $\psi(c)\ge\psi(b)>\log b$, as was to be shown.
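The monotonicity argument can be checked numerically. The sketch below is illustrative and rests on one assumption not legible in the scan: that $q(c; b, c) = \frac{1+c}{1-c}\log\frac{b}{c}$, which is the form consistent with the auxiliary function $\psi$ used above. It compares the stated derivative of $\psi$ with a central difference.

```python
import math


def phi(c, b):
    """Assumed form of q(c; b, c): ((1+c)/(1-c)) * log(b/c)."""
    return (1.0 + c) / (1.0 - c) * math.log(b / c)


def psi(c):
    """Auxiliary function psi(c) = log c + (1 - c^2)/(2c)."""
    return math.log(c) + (1.0 - c * c) / (2.0 * c)


def psi_prime(c):
    """Stated derivative psi'(c) = -(1 - c)^2 / (2 c^2)."""
    return -((1.0 - c) ** 2) / (2.0 * c * c)


def central_difference(f, x, h=1e-6):
    """Second-order finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)
```

Since $\psi(b)-\log b=(1-b^2)/2b>0$, the inequality $\psi(c)>\log b$ holds on the whole interval $0<c\le b$, so `phi` is strictly decreasing there.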

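Thus each of the four functions is strictly monotone, and from (22) it is seen that each decreases from $+\infty$ to 0 as c increases from 0 to b. Hence if a and b are given, with $a<1$, there are unique numbers $c_p$, $\bar c_p$, $\bar c_q$, $c_q$, between 0 and b, which satisfy

\[ p(b; b, c_p) = p(\bar c_p; b, \bar c_p) = q(\bar c_q; b, \bar c_q) = q(b; b, c_q) = I. \tag{45} \]

From (23) it is seen that these numbers satisfy the inequalities

\[ c_p < \bar c_p < \bar c_q < c_q, \tag{46} \]

and (24) takes the form $c_p\le c\le c_q$. The three cases (28) are equivalent to the following:

\[ \begin{aligned} &\text{Case } p. && c_p \le c \le \bar c_p,\\ &\text{Case } o. && \bar c_p \le c \le \bar c_q,\\ &\text{Case } q. && \bar c_q \le c \le c_q. \end{aligned} \tag{47} \]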

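We now calculate $L_c(I; b, c)$ in each of the three cases. In the first place, we have

\[ \begin{aligned} p_r(r; b, c) &= -\frac{2}{(1+r)^2}\log\frac{b}{r}, & p_c(r; b, c) &= -\frac{1-c}{c(1+c)},\\ q_r(r; b, c) &= \frac{2}{(1-r)^2}\log\frac{b}{r}, & q_c(r; b, c) &= -\frac{1+c}{c(1-c)}. \end{aligned} \tag{48} \]

From these and the definition (30), we have

\[ L_c(I; b, c) = \begin{cases} -\dfrac{1+c}{c(1-c)} - \left(\dfrac{1+r}{1-r}\right)^{2}\dfrac{1-c}{c(1+c)} & \text{where } p(r; b, c)=I \text{ (Case } p),\\[8pt] -\dfrac{2}{cI}\log\dfrac{b}{c} & \text{(Case } o),\\[8pt] -\dfrac{1-c}{c(1+c)} - \left(\dfrac{1-r}{1+r}\right)^{2}\dfrac{1+c}{c(1-c)} & \text{where } q(r; b, c)=I \text{ (Case } q). \end{cases} \tag{49} \]

In particular, the values of $L_c(I; b, c_p)$ and $L_c(I; b, c_q)$ are obtained from the first and last parts of (49) by putting $r=b$. Furthermore, we find from (49) that

\[ L_c(I; b, \bar c_p) = -\frac{2(1+\bar c_p)}{\bar c_p(1-\bar c_p)}, \qquad L_c(I; b, \bar c_q) = -\frac{2(1-\bar c_q)}{\bar c_q(1+\bar c_q)}; \tag{50} \]

that is, both the left- and right-hand derivatives at $\bar c_p$ and $\bar c_q$ have these values. Thus $L_c(I; b, c)$ exists everywhere and is continuous.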




It is evident from (49) that $L_c(I; b, c)<0$ in all cases. Hence from (29) we see that the smallest J for given a and b is obtained for $c=c_q$; we thus verify again that the hyperbolic expansion is maximized in this case. This does not however tell when d itself is largest. For this purpose, we have to maximize $M(I; b, c)$. From the definition (44) we find that

\[ M_c(I; b, c) = -\frac{2c}{1-c^2} - L_c(I; b, c). \tag{51} \]

We must investigate the sign of $M_c(I; b, c)$ in each of the cases (47).

Case p. We see at once that $M_c(I; b, c)>0$, so that the maximum value of $M(I; b, c)$ does not occur in this interval.

Case o. We find from (50) that

\[ M_c(I; b, \bar c_p) > 0, \qquad M_c(I; b, \bar c_q) >,=,< 0 \ \text{ according as } \ \bar c_q <,=,> 1/2. \tag{52} \]

Now $M_c(I; b, c)$ is seen to be decreasing, so that it is negative, if at all, in a subinterval abutting $\bar c_q$. Hence $M(I; b, c)$ is monotone increasing in the interval if $\bar c_q\le 1/2$, while if $\bar c_q>1/2$ it increases to a maximum and then decreases. In the latter case, the maximum is at a point $c>1/2$. For the condition $\bar c_q>1/2$ is equivalent to $q(1/2; b, 1/2)>I$ or $3\log 2b>I$; the condition $M_c(I; b, 1/2)>0$ reduces to the same form if $c=1/2$ comes in Case o, and is trivial if it comes in Case p.

Case q. The condition $M_c(I; b, c)>0$ is seen to reduce to

\[ (1+c)^2 < (1+r)^2/2r \quad \text{where } q(r; b, c)=I. \tag{53} \]

Since r increases with c, we see that (53) is more likely to be true the smaller c is. Hence $M(I; b, c)$ first increases and then decreases, or else is monotone increasing or decreasing. It starts to increase if $\bar c_q<1/2$, and increases throughout the interval if

\[ (1+c_q)^2 \le (1+b)^2/2b. \tag{54} \]

We note also that (53) is certainly true if $c<2^{1/2}-1$, since the right side is more than 2; and it is certainly false if $c\ge 1/2$, since then $r\ge c\ge 1/2$, so that $(1+r)^2/2r\le 9/4\le(1+c)^2$.

Putting together the results from the three cases, we see that either $M(I; b, c)$ is monotone increasing in the whole interval $c_p\le c\le c_q$, or else it first increases and then decreases. Its largest value is at a point c satisfying the following conditions: $c>\bar c_p$; either $\bar c_q<c<1/2$, $\bar c_q=c=1/2$, or $\bar c_q>c>1/2$; $c=c_q$ if and only if (54) is true; and if $c\ne c_q$ then $c>2^{1/2}-1$.

The conditions involved here may be expressed in terms of a and b. In the first place,

\[ \bar c_q <,=,> 1/2 \ \text{ according as } \ a <,=,> 1/8b^3. \tag{55} \]

Also, (54) with $c_q$ determined as the root of (39) is equivalent to

\[ \frac{ab}{(1-b)^2} \le \frac{c}{(1-c)^2} \quad \text{where } (1+c)^2 = \frac{(1+b)^2}{2b}. \tag{56} \]

Substituting the value of c, this becomes

\[ a \le \frac{(1-b)^2\,[\,2b(3-2b+3b^2) + (1+b)^3(2b)^{1/2}\,]}{b(1-6b+b^2)^2}. \tag{57} \]

Using the form (56), we see that this condition is true whenever $b\le 1/2$, since then $c\ge 1/2$; but that it is not true for $b>1/2$ and a near 1.

We now state the best upper bound for d in the various cases. The cases are distinguished according to the point c where $M(I; b, c)$ is largest.

Fig. 1


Case 1. $c\le\bar c_q$. This is true if $\bar c_q\ge 1/2$, or if $a\ge 1/8b^3$. The value of c is found from $M_c(I; b, c)=0$, using the formula of Case o. It is easily seen that the bound may be written in the form

\[ \log d \le \log\frac{1-c^2}{1-b^2} - \frac{c^2}{1-c^2}\log\frac{b}{c} \quad \text{where } \log\frac{b}{c} = \frac{c^2 I}{1-c^2}. \tag{58} \]

Case 2. $\bar c_q<c<c_q$. This is true if $a<1/8b^3$ and (57) is false. Here

\[ \log d \le \log\frac{1-c^2}{1-b^2} - p(r; b, c) \quad \text{where } q(r; b, c)=I \text{ and } (1+c)^2 = \frac{(1+r)^2}{2r}. \tag{59} \]

Case 3. $c=c_q$. This holds if (57) is true, and in particular if $b\le 1/2$. We then have the result (38), which was previously obtained under more restrictive conditions.
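Condition (57), which decides whether Case 3 applies, can be checked against the substitution that produces it. The sketch below is illustrative (the names are hypothetical, and (57) is used in the form $a \le (1-b)^2[2b(3-2b+3b^2)+(1+b)^3(2b)^{1/2}]/(b(1-6b+b^2)^2)$, an assumed reading of the display). It computes the threshold once directly from $c^\ast=(1+b)/(2b)^{1/2}-1$ inserted into (39), and once from the closed form.

```python
import math


def c_star(b):
    """Value of c at which (54) holds with equality: (1+c)^2 = (1+b)^2/(2b)."""
    return (1.0 + b) / math.sqrt(2.0 * b) - 1.0


def threshold_direct(b):
    """Largest a allowed by (54): insert c_star into (39) and solve for a."""
    c = c_star(b)
    return (1.0 - b) ** 2 * c / (b * (1.0 - c) ** 2)


def threshold_57(b):
    """Closed form on the right side of (57), as read here."""
    s = math.sqrt(2.0 * b)
    numerator = (1.0 - b) ** 2 * (2.0 * b * (3.0 - 2.0 * b + 3.0 * b * b)
                                  + (1.0 + b) ** 3 * s)
    return numerator / (b * (1.0 - 6.0 * b + b * b) ** 2)


def case_3_applies(a, b):
    """True when (57) holds, i.e. the maximum of M is at c = c_q."""
    return a <= threshold_57(b)
```

At b = 1/2 the threshold is exactly 1, which matches the remark that (57) holds for every admissible a when b does not exceed 1/2.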

The values of a and b for which the various cases hold are shown in Fig- 
ure 1. 




7. Relations between a, c, d. The lower bound for d in terms of a and c is obtained from (42) by taking b as large as possible, consistent with (31). If (40) is satisfied, then there is a largest value possible for b; the required lower bound for d is (42) with b determined from (37), which is seen to be the same as (36) with the same value of b substituted. If (40) is false, then b may be arbitrarily near to 1, and (42) shows that there is no positive lower bound for d. Thus our previous results for this case are checked.

We now turn to the problem of finding the upper bound for d in terms 
of a and c. To do this, we have to eliminate b from (43). This turns out to be 
the most difficult problem of all. In order to prevent this section from being 
unreasonably long, we shall omit a number of calculations; but some of these 
are quite similar to those given in §8 for the case of unbounded functions. 

Since $L(I; b, c)$ is defined in (30) by different formulas according to which of the cases (28) holds, we must now consider in what intervals b must lie in order that each of these cases may apply, when I and c are given. It is clear from (22) that all four functions

\[ p(b; b, c), \quad p(c; b, c), \quad q(c; b, c), \quad q(b; b, c) \]

are increasing functions of b. As b increases from c to 1, the four functions increase from 0 to certain limiting values:

\[ \begin{aligned} p(1; 1, c) &= \log\frac{(1+c)^2}{4c}, & p(c; 1, c) &= \frac{1-c}{1+c}\log\frac{1}{c},\\ q(c; 1, c) &= \frac{1+c}{1-c}\log\frac{1}{c}, & q(1; 1, c) &= +\infty. \end{aligned} \tag{60} \]

We wish to determine values of $b_p$, $\bar b_p$, $\bar b_q$, $b_q$, not greater than 1, which satisfy

\[ p(b_p; b_p, c) = p(c; \bar b_p, c) = q(c; \bar b_q, c) = q(b_q; b_q, c) = I, \tag{61} \]

so far as this is possible. We can always find $b_q<1$; and we can find

\[ \bar b_q \le 1 \ \text{ if } \ a \ge c^{(1+c)/(1-c)}, \qquad \bar b_p \le 1 \ \text{ if } \ a \ge c^{(1-c)/(1+c)}, \qquad b_p \le 1 \ \text{ if } \ a \ge \frac{4c}{(1+c)^2}. \tag{62} \]

The quantities $b_p$, $\bar b_p$, $\bar b_q$, $b_q$, so far as they exist, are seen from (23) to satisfy the inequalities

\[ b_p > \bar b_p > \bar b_q > b_q. \tag{63} \]

If any one of the quantities does not exist, we shall treat it in inequalities as if it were more than 1. For example, $b_p>1$ would mean that no $b_p\le 1$ can be found. If $\bar b_p\ge 1$, then $b\ge\bar b_p$ is impossible, since $b<1$ in any case; and $b\le\bar b_p$ would impose no condition on b. With this interpretation, the three cases (28) take the form


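\[ \begin{aligned} &\text{Case } p. && b_p \ge b \ge \bar b_p,\\ &\text{Case } o. && \bar b_p \ge b \ge \bar b_q,\\ &\text{Case } q. && \bar b_q \ge b \ge b_q. \end{aligned} \tag{64} \]

For some values of a and c, not all the cases occur.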


We now calculate $L_b(I; b, c)$ in each of the three cases. Besides (48) we have

\[ p_b(r; b, c) = \frac{1-r}{b(1+r)}, \qquad q_b(r; b, c) = \frac{1+r}{b(1-r)}. \tag{65} \]

From these we find that

\[ L_b(I; b, c) = \begin{cases} \dfrac{2(1+r)}{b(1-r)} & \text{where } p(r; b, c)=I \text{ (Case } p),\\[8pt] \dfrac{2}{bI}\log\dfrac{b}{c} & \text{(Case } o),\\[8pt] \dfrac{2(1-r)}{b(1+r)} & \text{where } q(r; b, c)=I \text{ (Case } q). \end{cases} \tag{66} \]

We see that at $\bar b_p$ and $\bar b_q$ the derivative has the same value to the left and to the right.

It is clear that $L_b(I; b, c)>0$ in all cases, so that $L(I; b, c)$ is a monotone increasing function of b. Hence by (29), the smallest value of J for given a and c is obtained by taking b as small as possible. We thus verify that the hyperbolic expansion is maximized in this way.

We next consider the behavior of $L(I; b, c)$ as $b\to1$. In order for $b\to1$ to be possible, we must have $b_p\ge 1$. It may be shown that $L(I; b, c)$ approaches a finite limit if $b_p>1$, and that $L(I; b, c)+2\log(1-b)$ approaches a finite limit if $b_p=1$. From this we find that

\[ \begin{aligned} M(I; b, c) &\to +\infty \ \text{ as } b\to1 \ \text{ if } b_p>1,\\ M(I; b, c) &\to -\infty \ \text{ as } b\to1 \ \text{ if } b_p=1. \end{aligned} \tag{67} \]


The first formula shows that no upper bound for d in terms of a and c can be found if $b_p>1$. On the other hand, if $b_p=1$, then $d\to0$ as $b\to1$. In fact, if $d_{\min}$ and $d_{\max}$ denote the smallest and largest values of d for given a, b, c, it may be shown that if a and c have fixed values such that $b_p=1$, then

\[ d_{\max}/d_{\min} \to 4/e \ \text{ as } b\to1. \tag{68} \]

We now turn to the consideration of

\[ M_b(I; b, c) = \frac{2b}{1-b^2} - L_b(I; b, c). \tag{69} \]


From (66) we see that the condition $M_b(I; b, c)<0$ at the points $b_p$, $\bar b_p$, $\bar b_q$, $b_q$ reduces to

\[ b_p<1, \qquad \bar b_p<((1+c)/2)^{1/2}, \qquad \bar b_q<((1-c)/2)^{1/2}, \qquad b_q<1/2, \tag{70} \]

respectively. Since $p(b_p; b_p, c)=I$, the first condition is equivalent to $p(1; 1, c)>I$, and similarly for the others. The four conditions become

\[ a>\frac{4c}{(1+c)^2}, \qquad a>\left(\frac{2c^2}{1+c}\right)^{(1-c)/2(1+c)}, \qquad a>\left(\frac{2c^2}{1-c}\right)^{(1+c)/2(1-c)}, \qquad a>\frac{c}{2(1-c)^2}. \tag{71} \]

These conditions are satisfied above the curves which are denoted by $p$, $\bar p$, $\bar q$, $q$, respectively, in Figure 2. We wish to know that the curves have the relative position shown in the figure. To verify that curve $\bar p$ lies below curve $p$, we put

\[ \phi(c) = (3+5c)\log 2 + 4c\log c - (3+5c)\log(1+c), \]

and have to show $\phi(c)>0$ for $0<c<1$. Calculating the first and second derivatives, we find that $\phi''(c)>0$ and $\phi'(1)=0$, hence $\phi'(c)<0$ for $0<c<1$; then since $\phi(1)=0$, we find that $\phi(c)>0$ in the interval. By means of similar considerations, we can show that each other pair of curves intersects in exactly one point, and then by numerical calculation it is easily seen that the points of intersection lie as shown in the figure.
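The claim that curve $\bar p$ lies below curve $p$ lends itself to a direct numerical check. The sketch below is illustrative only; the curve formulas are the first two conditions of (71) as read here, $a=4c/(1+c)^2$ and $a=(2c^2/(1+c))^{(1-c)/2(1+c)}$, and the function names are hypothetical. Taking logarithms of the inequality between the two curves gives exactly the function $\phi(c)$ defined above.

```python
import math


def curve_p(c):
    """Curve p of Figure 2 (assumed form): a = 4c / (1+c)^2."""
    return 4.0 * c / (1.0 + c) ** 2


def curve_p_bar(c):
    """Curve p-bar (assumed form): a = (2c^2/(1+c))^((1-c)/(2(1+c)))."""
    return (2.0 * c * c / (1.0 + c)) ** ((1.0 - c) / (2.0 * (1.0 + c)))


def phi(c):
    """phi(c) = (3+5c) log 2 + 4c log c - (3+5c) log(1+c)."""
    return ((3.0 + 5.0 * c) * math.log(2.0) + 4.0 * c * math.log(c)
            - (3.0 + 5.0 * c) * math.log(1.0 + c))


def phi_second(c):
    """Second derivative of phi: (4 + c - c^2) / (c (1+c)^2), positive on (0,1)."""
    return (4.0 + c - c * c) / (c * (1.0 + c) ** 2)
```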

By a detailed study of the three cases (64), it may be shown that $M_b(I; b, c)<0$, if at all, in a single subinterval of the whole interval $b_p\ge b\ge b_q$. It is clear then from (70) and (67) that a necessary and sufficient condition that $M(I; b, c)$ should be monotone decreasing is that $b_p\le 1$ and $b_q\le 1/2$. It is seen that this is also the condition that the largest value of $M(I; b, c)$ is for $b=b_q$.

It may be shown further that $M(I; b, c)$ is decreasing in some subinterval for $(c, a)$ above the heavy broken curve in Figure 2. This curve is tangent to $\bar q$. The equation of the curve to the left of the point of tangency is

\[ a = \frac{c}{K(1-c)^2}, \tag{72} \]

where $K=2.31\cdots$ is a constant. This part of the curve together with $q$ and $\bar q$ bound a region where $M(I; b, c)$ is decreasing somewhere between $b_q$ and $\bar b_q$. The equation of the curve to the right of the point of tangency is

\[ c = \left[\,1+I-(2I+I^2)^{1/2}\,\right]^{1/2}\exp\left\{-\tfrac{1}{2}\left[(2I+I^2)^{1/2}-I\right]\right\}, \tag{73} \]

where as usual $I=\log 1/a$. This curve with $\bar q$ and $\bar p$ bounds a region where $M(I; b, c)$ is decreasing somewhere between $\bar b_q$ and $\bar b_p$.

The three heavy curves in Figure 2 divide the unit square into 5 regions which are numbered from 1 to 5. In these, $M(I; b, c)$ has the following behaviour: 1, decreasing throughout; 2, increasing and then decreasing; 3, decreasing and then increasing; 4, increasing, then decreasing, then increasing again; 5, increasing throughout.

Fig. 2


We come finally to expressing the upper bound for d in terms of a and c. Only if $b_p\le 1$ is there any upper bound for d. In region 1, for $b_p\le 1$ and $b_q\le 1/2$, the largest d is attained for the smallest possible b, as was mentioned in (41). The result is then (38) with b determined from (39). Region 2 is divided into two parts by $\bar q$. In the small region to the left we have

\[ \log d \le \log\frac{1-c^2}{1-b^2} - p(1-2b^2; b, c), \tag{74} \]

where b is the smaller root of $q(1-2b^2; b, c)=I$. To the right we have

\[ \log d \le \log\frac{1-c^2}{1-b^2} - \frac{b^2}{1-b^2}\log\frac{b}{c}, \tag{75} \]

where b is the smaller root of $\log b/c = b^2 I/(1-b^2)$.

Finally, we restate the most striking result: If values of a and c are given 
for which b cannot have values arbitrarily near to 1, then there is a maximum 
possible value for d; and the same is true for values of a and c which are limits of 
such values. But for all other values of a and c, d may have arbitrarily large values.
The surprising part is that there is a sudden jump from one case to the other, 
rather than a gradual transition. 

8. Appendix. As a supplement to our study of bounded univalent func- 
tions, we now consider univalent functions which are not supposed bounded. 
The results of this section are obtained from those of §5 by a suitable passage 
to a limit; no use is made of §§6 and 7. 

Let F(z) be a function which is regular and univalent for $|z|<1$, and for which $F(0)=0$ and $F'(0)=1$. For any fixed $z_0$ in the unit circle, we put

\[ b=|z_0|, \qquad C=|F(z_0)|, \qquad D=|F'(z_0)|. \tag*{[1]} \]

We shall study the relations between b, C, D. Individually, the quantities satisfy only the inequalities

\[ 0<b<1, \qquad 0<C, \qquad 0<D. \tag*{[2]} \]


Now any such F(z) can be approximated by bounded functions of the 
same type, if the bound is allowed to vary. Hence any possible values of 
b, C, D can be approximated for bounded functions. Conversely, on the basis 
of [5], the univalent functions form a normal family, so that any values of 
b, C, D with $0<b<1$ which can be approximated can also be assumed. Hence
the values possible for b, C, D are those attained for bounded F(z) together
with the limit points satisfying 0<b<1. 

If F(z) is bounded, we can choose $a>0$ so that $|aF(z)|<1$ for $|z|<1$. We then put

\[ f(z) = aF(z), \tag*{[3]} \]

so that f(z) is a function of the class previously considered, and $f'(0)=a$. Thus a and b have the same meaning as before, and

\[ c=aC, \qquad d=aD. \tag*{[4]} \]


From the known relations between a, b, c, d, we obtain the relations between 
a, b, C, D; by eliminating a, we find the relations between b, C, D. 

Relations between b and C. If in formula (31) we put $c=aC$, and then let $a\to0$, we obtain the well known inequalities(6)

\[ \frac{b}{(1+b)^2} \le C \le \frac{b}{(1-b)^2}. \tag*{[5]} \]

The bounds are attained for the function

\[ F(z) = \frac{z}{(1-z)^2} \tag*{[6]} \]

for $z_0<0$ and $z_0>0$, respectively. The function [6] maps $|z|<1$ onto the w-plane excluding those points for which $w\le -1/4$.
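The extremal role of the function [6] is easy to see numerically. The following sketch is illustrative (the helper names are not from the paper): it evaluates $F(z)=z/(1-z)^2$ and its derivative $F'(z)=(1+z)/(1-z)^3$ on a circle $|z|=b$ and compares with the bounds [5].

```python
import cmath
import math


def koebe(z):
    """The function [6]: F(z) = z / (1 - z)^2."""
    return z / (1 - z) ** 2


def koebe_prime(z):
    """Derivative of [6]: F'(z) = (1 + z) / (1 - z)^3."""
    return (1 + z) / (1 - z) ** 3


def bounds_C(b):
    """Lower and upper bounds [5] for C = |F(z0)| when |z0| = b."""
    return b / (1.0 + b) ** 2, b / (1.0 - b) ** 2


def sample_C(b, n=360):
    """Values of |F(z)| at n equally spaced points of the circle |z| = b."""
    return [abs(koebe(b * cmath.exp(2j * math.pi * k / n))) for k in range(n)]
```

The minimum of `sample_C(b)` occurs at z = -b and the maximum at z = b, and as z tends to -1 the value of F tends to -1/4, the tip of the excluded slit.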

Relations between b, C, D. We remark first that the required inequalities are not obtained from the relations between b, c, d by passing to a limit. If in (34) we put $c=aC$ and $d=aD$, and let $a\to0$, we obtain Nevanlinna's result(7)

\[ \frac{1-b}{b(1+b)} \le \frac{D}{C} \le \frac{1+b}{b(1-b)}. \tag*{[7]} \]

This inequality gives the best bounds for D/C in terms of b. It does not however give the best bounds for D in terms of b and C, since either of the equalities in (34) is attained only for a certain positive value of a.

(6) See for example L. Bieberbach, Lehrbuch der Funktionentheorie, vol. 2, chap. 1, §9.
(7) R. Nevanlinna, Über die konforme Abbildung von Sterngebieten, Finska Vetenskaps-Societeten Förhandlingar, vol. 63 (1921), no. 6, p. 18.

The sharp lower bound for D in terms of b and C is easily obtained. In (42) we put $c=aC$, $d=aD$, and let $a\to0$, which gives

\[ D \ge \frac{C^2(1-b^2)}{b^2}. \tag*{[8]} \]

It may be verified that the equality is attained for a function mapping $|z|<1$ on the w-plane with slits along the positive and negative real axes. For a given b, the slits may be so chosen that C has any possible value. The equality is attained for any positive or negative $z_0$.


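We turn now to the problem of finding the upper bound for D in terms of b and C. We start by introducing the functions

\[ \begin{aligned} P(r; b, C) &= \log\frac{r}{C(1+r)^2} + \frac{1-r}{1+r}\log\frac{b}{r},\\ Q(r; b, C) &= \log\frac{r}{C(1-r)^2} + \frac{1+r}{1-r}\log\frac{b}{r} \end{aligned} \tag*{[9]} \]

for $0<r\le b$; for $r=0$ we put

\[ P(0; b, C) = Q(0; b, C) = \log b/C, \tag*{[10]} \]

which is the limiting form of [9]. The functions (21) are related to these by the equations

\[ \begin{aligned} p(r; b, c) &= p(r; b, aC) = P(r; b, C) - \log\frac{a}{(1+aC)^2},\\ q(r; b, c) &= q(r; b, aC) = Q(r; b, C) - \log\frac{a}{(1-aC)^2}. \end{aligned} \tag*{[11]} \]

The function $P(r; b, C)$ is a decreasing and $Q(r; b, C)$ an increasing function of r; both increase with b, and decrease as C increases. The partial derivatives have the values

\[ \begin{aligned} P_r(r; b, C) &= -\frac{2}{(1+r)^2}\log\frac{b}{r}, & P_b(r; b, C) &= \frac{1-r}{b(1+r)}, & P_C(r; b, C) &= -\frac{1}{C},\\ Q_r(r; b, C) &= \frac{2}{(1-r)^2}\log\frac{b}{r}, & Q_b(r; b, C) &= \frac{1+r}{b(1-r)}, & Q_C(r; b, C) &= -\frac{1}{C}. \end{aligned} \tag*{[12]} \]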


The three cases (28) take the following limiting forms as $a\to0$:

\[ P(b; b, C)\le 0\le P(0; b, C), \qquad P(0; b, C)\le 0\le Q(0; b, C), \qquad Q(0; b, C)\le 0\le Q(b; b, C). \tag*{[13]} \]

These are equivalent to

\[ C\le b, \qquad C=b, \qquad C\ge b, \tag*{[14]} \]

respectively, where C of course satisfies [5].

In Cases p and q, we determined r from $p(r; b, c)=I$ and $q(r; b, c)=I$, respectively. These equations are the same as

\[ P(r; b, C)+2\log(1+aC)=0, \qquad Q(r; b, C)+2\log(1-aC)=0. \tag*{[15]} \]

As $a\to0$, their roots approach those of

\[ P(r; b, C)=0, \qquad Q(r; b, C)=0. \tag*{[16]} \]

Putting $c=aC$ and $d=aD$ in (43), and letting $a\to0$, we find that

\[ \log D \le M(b, C), \tag*{[17]} \]

where

\[ M(b, C) = \log\frac{1}{1-b^2} - L(b, C), \tag*{[18]} \]

$L(b, C)$ being defined for all possible values of b and C by

\[ L(b, C) = \begin{cases} Q(r; b, C) & \text{where } r \text{ satisfies } P(r; b, C)=0, \text{ if } C\le b,\\ P(r; b, C) & \text{where } r \text{ satisfies } Q(r; b, C)=0, \text{ if } C\ge b. \end{cases} \tag*{[19]} \]

These two formulas correspond to Cases p and q in (30). For the case $C=b$, either of these, and also Case o, leads to the result

\[ D \le \frac{1}{1-b^2} \quad \text{for } C=b. \tag*{[20]} \]

It may be verified that the equality in [20] is attained for a function F(z) which maps $|z|<1$ on the w-plane slit to infinity at one or both ends of the perpendicular bisector of the segment joining 0 and $F(z_0)$. We see from [8] and [17] that when C has its smallest value, we must have $D=(1-b)/(1+b)^3$, and that when C has its largest value, $D=(1+b)/(1-b)^3$; in these cases the equalities in both [8] and [17] are attained for the function [6]. We know that in any case, there is some function for which the equality in [17] is attained; but the extremal function does not seem to be of a very simple sort except in the three cases mentioned.

Relations between b and D. From [8] we see that D has its smallest value when C has its smallest value, determined from [5]. This gives

\[ D \ge \frac{1-b}{(1+b)^3}. \tag*{[21]} \]

The equality is attained for the function [6] with $z_0<0$.

We next determine the upper bound for D. In the first part of [19], as C increases, r decreases, and hence $Q(r; b, C)$ decreases; in the second part, as C increases, r increases, and hence $P(r; b, C)$ decreases. Hence in either case, $L(b, C)$ is decreasing, or $M(b, C)$ is increasing. Thus the largest value of D is obtained when C has its largest value, and hence

\[ D \le \frac{1+b}{(1-b)^3}. \tag*{[22]} \]

The equality is attained for the function [6] with $z_0>0$.


We may also verify that $M(b, C)$ is increasing by calculating its partial derivative. Making use of [12], we find that

\[ M_C(b, C) = \begin{cases} \dfrac{1}{C}\left[\left(\dfrac{1+r}{1-r}\right)^{2}+1\right] & \text{where } P(r; b, C)=0, \text{ if } C\le b,\\[10pt] \dfrac{1}{C}\left[\left(\dfrac{1-r}{1+r}\right)^{2}+1\right] & \text{where } Q(r; b, C)=0, \text{ if } C\ge b, \end{cases} \tag*{[23]} \]

so that $M_C(b, C)>0$ throughout.

The inequalities [21] and [22] ("distortion theorem") are well known, and were used to derive [5] in the original approach to this subject(8).

Remarks about D/C and D/C². From [8] we see that D/C has its smallest value when C is smallest. From [23] we see that $M(b, C)-\log C$ is an increasing function, and hence D/C has its largest value when C is largest. Thus we are again led to [7].

On the other hand, [8] shows that D/C² can reach its smallest value for any C. From [23] we see that $M(b, C)-2\log C$ has its maximum for $C=b$, so that D/C² attains its largest value only in this case. We obtain the inequalities

\[ \frac{1-b^2}{b^2} \le \frac{D}{C^2} \le \frac{1}{b^2(1-b^2)}. \tag*{[24]} \]

This result gives the bounds for the derivative of 1/F(z), or for the derivative of a function univalent in the exterior of the unit circle and leaving $\infty$ fixed. The problem was solved in this form by Löwner(9) (without using the "method of Löwner").

(8) See Bieberbach, loc. cit.
(9) K. Löwner, Über Extremumsätze bei der konformen Abbildung des Äusseren des Einheitskreises, Mathematische Zeitschrift, vol. 3 (1919), pp. 65-77.


Relations between C and D. To find the lower bound for D in terms of C, we must eliminate b from [8]. Evidently the larger b is, the smaller D may be. Now b is restricted only by [5]. If $C\ge 1/4$, b may have values arbitrarily near to 1, and hence there is no positive lower bound for D. But if $C<1/4$, then the lower bound for D is given by [8] with b determined from $b/(1+b)^2=C$. Solving for b and substituting gives

\[ 2D \ge 1-4C+(1-2C)(1-4C)^{1/2} \quad \text{for } C<1/4. \tag*{[25]} \]

The equality is attained for the function [6] with $z_0<0$.

We now turn to the problem of finding the upper bound for D in terms of C. We first consider the behavior of $L(b, C)$ as $b\to1$. In order that $b\to1$ should be possible, we must have $C\ge 1/4$. If $C\ge 1$, then the second part of [19] applies. As $b\to1$, r decreases, and $L(b, C)$ increases to a finite limit. If $C<1$, then the first part of [19] applies for b near 1. As $b\to1$, r increases, and $L(b, C)$ increases; and $L(b, C)$ approaches a finite limit unless $r\to1$. Now if $r\to1$ as $b\to1$, we must have $P(1; 1, C)=0$, or $C=1/4$. Hence if $C>1/4$, $L(b, C)$ increases to a finite limit as $b\to1$. The case $C=1/4$ remains to be considered. Here r is determined from $P(r; b, 1/4)=0$, which is equivalent to

\[ \log b = \log r + \frac{1+r}{1-r}\log\frac{(1+r)^2}{4r}. \tag*{[26]} \]

From this we find that

\[ 1-r \sim 2(1-b). \tag*{[27]} \]

Using this in the formula $L(b, 1/4)=Q(r; b, 1/4)$, we find that

\[ L(b, 1/4)+2\log(1-b) \to 1. \tag*{[28]} \]

From these results we find that

\[ \begin{aligned} M(b, C) &\to +\infty \ \text{ as } b\to1 \ \text{ if } C>1/4,\\ M(b, C) &\to -\infty \ \text{ as } b\to1 \ \text{ if } C=1/4. \end{aligned} \tag*{[29]} \]

The first formula shows that no upper bound for D can be found if $C>1/4$. The second part may be written more accurately as

\[ M(b, 1/4)-\log(1-b) \to -1-\log 2 \ \text{ as } b\to1. \tag*{[30]} \]

If we denote by $D_{\min}$ and $D_{\max}$ the smallest and largest values of D which are possible for given b and C, then for $C=1/4$ we have from [8] and [30]

\[ \frac{D_{\max}}{D_{\min}} \to \frac{4}{e} \ \text{ as } b\to1. \tag*{[31]} \]

We now consider the derivative of $M(b, C)$. Using [12], we find that

\[ M_b(b, C) = \begin{cases} \dfrac{2b}{1-b^2} - \dfrac{2(1-r)}{b(1+r)} & \text{where } Q(r; b, C)=0, \text{ if } b\le C,\\[10pt] \dfrac{2b}{1-b^2} - \dfrac{2(1+r)}{b(1-r)} & \text{where } P(r; b, C)=0, \text{ if } b\ge C. \end{cases} \tag*{[32]} \]

Hence at $b=C$, $M_b(b, C)$ has the same value to the left and to the right.

A remark which will be useful below is the following. For any fixed b the root r of $P(r; b, C)=0$ decreases from b to 0 as C increases from its smallest possible value to b; and the root r of $Q(r; b, C)=0$ increases from 0 to b as C increases from b to its largest possible value. Hence if $0\le r\le b$, the equation $P(r; b, C)=0$ determines a value of $C\le b$, and $Q(r; b, C)=0$ determines a $C\ge b$, both satisfying [5].

Now the equation

\[ M_b(b, C) = 0 \]

requires in the first case that $r=1-2b^2$ and in the second that $r=2b^2-1$. Since we must have $0\le r\le b$, $M_b(b, C)$ vanishes only in the following cases:

\[ M_b(b, C)=0 \quad \begin{cases} \text{if } 1/2\le b\le 2^{-1/2} \text{ and } Q(1-2b^2; b, C)=0,\\ \text{if } 2^{-1/2}\le b<1 \text{ and } P(2b^2-1; b, C)=0. \end{cases} \tag*{[33]} \]


These may be combined in the statement that $M_b(b, C)=0$ only along the curve

\[ 4C = b^{-1-1/b^2}\,\bigl|2-1/b^2\bigr|^{\,2-1/b^2}, \qquad 1/2\le b<1, \tag*{[34]} \]

which from the preceding paragraph must lie between the bounds [5]. To study this curve, we put $h=1/b^2$ and

\[ \phi(h) = 2\log 4C. \]

Then

\[ \phi(h) = (h+1)\log h - 2(h-2)\log|h-2|, \]

where for $h=2$ we interpret the right side as its limiting value $3\log 2$. Differentiating, we have

\[ \phi'(h) = \log h - 2\log|h-2| + \frac{1}{h} - 1, \qquad \phi''(h) = \frac{1}{h} - \frac{1}{h^2} - \frac{2}{h-2}. \]

We see that $\phi''(h)>0$ for $1<h<2$, and $\phi''(h)<0$ for $2<h<4$. Using this we find that $\phi'(h)$ increases from 0 to $+\infty$ as h increases from 1 to 2, and then decreases to $-3/4$ as h increases to 4. We must have $\phi'(h)=0$ at some point between 2 and 4, and in fact for $h=3.27\cdots$; and here $\phi(h)$ has its maximum. Therefore, in [34], C increases from 2 when $b=1/2$ to a maximum $K=2.31\cdots$ for $b=0.55\cdots$, and then decreases to 1/4 as b increases to 1. Figure 3 shows [34] as a broken curve and the bounds [5] as solid curves.
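The numerical constants quoted for the curve [34] can be reproduced with a few lines of code. The sketch below is illustrative (the helper names, the bisection routine and the bracketing interval [2.5, 4] are choices made here, not taken from the paper). It locates the zero of $\phi'(h)$ between 2 and 4 and evaluates $C=\tfrac14 e^{\phi(h)/2}$ there.

```python
import math


def phi(h):
    """phi(h) = (h+1) log h - 2(h-2) log|h-2|, with limiting value at h = 2."""
    if h == 2.0:
        return 3.0 * math.log(2.0)
    return (h + 1.0) * math.log(h) - 2.0 * (h - 2.0) * math.log(abs(h - 2.0))


def phi_prime(h):
    """phi'(h) = log h - 2 log|h-2| + 1/h - 1 (undefined at h = 2)."""
    return math.log(h) + 1.0 / h - 1.0 - 2.0 * math.log(abs(h - 2.0))


def C_on_curve(h):
    """Value of C on the curve [34], from phi(h) = 2 log 4C."""
    return math.exp(0.5 * phi(h)) / 4.0


def bisect(f, lo, hi, tol=1e-12):
    """Plain bisection for a sign change of f on [lo, hi]."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


h_star = bisect(phi_prime, 2.5, 4.0)   # location of the maximum of phi
K = C_on_curve(h_star)                 # the constant K of (72)
b_star = h_star ** -0.5                # corresponding value of b
```

The endpoint values `C_on_curve(4.0)` and `C_on_curve(1.0)` give the values of C at b = 1/2 and b = 1.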

The curve [34] divides the region defined by [5] into two parts. There is no difficulty in seeing that $M_b(b, C)<0$ in the lower part, and $M_b(b, C)>0$ in the upper part. Hence $M(b, C)$ is a decreasing function of b if $C\le 1/4$; if $1/4<C\le 2$, it first decreases, and then increases to $+\infty$; if $2<C<K$, it first increases, then decreases, then increases to $+\infty$; and if $C\ge K$, $M(b, C)$ increases throughout, and approaches $+\infty$ as $b\to1$. There is an upper bound for D only if $C\le 1/4$, and then it is attained when b has its smallest possible value. Hence the bound is given by [22] with b determined from $b/(1-b)^2=C$. Substituting this value of b, we find that

\[ 2D \le 1+4C+(1+2C)(1+4C)^{1/2} \quad \text{for } C\le 1/4. \tag*{[35]} \]

The equality is attained for the function [6] with $z_0>0$. It is to be noted that the upper bound for D in terms of C increases from 1 to $2.06\cdots$ as C increases from 0 to 1/4, and then jumps to $+\infty$.
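The substitution leading to [35] can be verified by comparing the closed form with the parametric description through b. The sketch below is illustrative; `upper_D` and `koebe_values` are names chosen here. With $s=(1+4C)^{1/2}=(1+b)/(1-b)$ one has $D=s(1+s)^2/4$, which expands to the right side of [35].

```python
import math


def upper_D(C):
    """Upper bound [35] for D in terms of C, valid for 0 <= C <= 1/4."""
    return 0.5 * (1.0 + 4.0 * C + (1.0 + 2.0 * C) * math.sqrt(1.0 + 4.0 * C))


def koebe_values(b):
    """C and D for the function [6] at z0 = b > 0."""
    return b / (1.0 - b) ** 2, (1.0 + b) / (1.0 - b) ** 3
```

At C = 1/4, which corresponds to $b=3-2^{3/2}$, the bound equals $1+\tfrac34\,2^{1/2}$, the value quoted as $2.06\cdots$.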


UNIVERSITY OF CALIFORNIA, 
BERKELEY, CALIF. 


Fig. 3


ON THE OSCILLATION OF DIFFERENTIAL TRANSFORMS. I 


BY 
G. SZEGO 


1. INTRODUCTION 


1.1. In a recent paper G. Pólya and N. Wiener(1) studied the relation
between the analytic character of a real periodic function and the number of
the sign variations of its derivatives. The purpose of the present paper is to
develop another way of attacking this problem different from that used by
the authors mentioned. It leads to a new proof of Theorem 1 of the paper of
Pólya and Wiener and to refinements of their Theorems 2 and 3 which are in
a certain sense best possible results(2).

Let f(x) be a real periodic function with period $2\pi$ for which all derivatives
$f^{(k)}(x)$ exist. We denote by $N_k$ the number of the mod $2\pi$ distinct values of x
for which a sign variation of $f^{(k)}(x)$ takes place. In what follows we give first
a new proof of Theorem 1 of Pólya and Wiener. A further, more elaborate,
application of our method leads to the following results which correspond to
the Theorems 3 and 2, respectively, of the authors mentioned.


THEOREM A. Let $N_k < k/\log k$ provided k is sufficiently large. Then f(x) is
an integral function.


THEOREM B. Let $\rho > 1$ and let $N_k < (k/\rho)^{1/\rho}/2$ provided k is sufficiently large.
Then f(x) is an integral function of order not greater than $\rho/(\rho-1)$.


The following results are more informative. 
THEOREM A'. Let for sufficiently large k

(1.1.1)    $N_k < \dfrac{k}{\log k}\Bigl(1 + \dfrac{\log\log k - \omega(k)}{\log k}\Bigr)$,

where $\omega(k) \to +\infty$. Then the conclusion of Theorem A holds.


THEOREM B'. Let $\rho > 1$ and let p be a positive number such that pp**¥*>1.
If for sufficiently large k

log k 
(1.1.2) Me < ) 


Rie)’ 


then the conclusion of Theorem B holds. 


Presented to the Society, November 22, 1941; received by the editors January 2, 1942. 

(1) G. Pólya and N. Wiener, On the oscillation of the derivatives of a periodic function, these
Transactions, vol. 52 (1942), pp. 249-256.

(2) See the counterexamples given below, section 7.




DIFFERENTIAL TRANSFORMS. I 451 


The following result contains Theorem A' (therefore also Theorem A):

THEOREM A''. Let H be a constant such that

(1.1.3)    $H + \log((1/2)\log 2) > 0$,

and let for sufficiently large k

(1.1.4)    $N_k < \dfrac{k}{\log k}\Bigl(1 + \dfrac{\log\log k - H}{\log k}\Bigr)$.

Then the function f(x) is analytic in the strip

(1.1.5)    $|\Im(x)| < H + \log((1/2)\log 2)$.


1.2. In various conversations Professor Hille suggested certain analogues
of these problems considering $\vartheta^l f(x)$ instead of $f^{(k)}(x)$ where $\vartheta$ is a given second
order differential operator satisfying suitable conditions(3). In the last
part of the present paper we illustrate the further applicability of our method
by discussing the special operator

(1.2.1)    $\vartheta = (1 - x^2)D^2 - 2xD, \qquad D = d/dx,$

the "characteristic functions" of which are the classical Legendre functions.
Let f(x) be a function having derivatives of all orders in $-1 \le x \le +1$ and
let $N_k = N_{2l}$ denote the number of the sign variations of $\vartheta^l f(x)$ in the interval
$-1 < x < +1$. Then we prove(4)


THEOREM C. If $N_k \le N$, k sufficiently large and even, then f(x) must be a
polynomial of degree less than or equal to N.


THEOREM D. If $N_k$ satisfies the condition of Theorem A'', k even, then f(x)
is analytic in an ellipse with foci at -1 and +1 the sum of the semi-axes of which
is

    $\exp\{H + \log((1/2)\log 2)\}$.


These results correspond to Theorem 1 of Pólya and Wiener and to Theorem
A'', respectively. The analogue of Theorem B can also be dealt with.

1.3. In what follows we give the proofs of the results formulated above.
Section 2 contains a new proof of Theorem 1 of Pólya and Wiener; the underlying
idea of this proof is used throughout the present paper. Section 3 contains
the proof of Theorem A'', section 4 that of Theorem B'. Sections 5
and 6 are devoted to the proofs of Theorems C and D involving Legendre's
operator. Finally in section 7 certain counterexamples are exhibited which


(3) See below, pp. 463-497.
(4) The proof furnishes the conclusion of Theorem C under the condition that $N_k \le N$ holds
for an infinite number of k values. (The same holds in section 2.)


452 G. SZEGO . [November 


show that the conditions of Theorems A and B on $N_k$ cannot be replaced
by $N_k = O(k)$ and $N_k = O(k^\alpha)$, $\alpha > 1/\rho$, respectively.


2. NEW PROOF OF THEOREM 1 OF PÓLYA AND WIENER
2.1. Let

(2.1.1)    $f(x) = \sum_{\nu=-\infty}^{+\infty} c_\nu e^{i\nu x}, \qquad c_{-\nu} = \bar{c}_\nu,$

and let $2N_k$ be the number of the mod $2\pi$ distinct sign variations of $f^{(k)}(x)$.
We assume that k goes to $+\infty$ through a sequence of integers such that $N_k$
has a constant value N. Then we show that f(x) is a trigonometric polynomial
of degree less than or equal to N, that is, we prove $c_{N+m} = 0$, $m > 0$.

Let $x_1, x_2, \dots, x_{2N}$ denote the mod $2\pi$ distinct sign variations of $f^{(k)}(x)$,
that is, the values of x for which $f^{(k)}(x)$ changes its sign; $x_\nu = x_\nu(k)$. Let $\alpha$
be real and(5)

(2.1.2)    $u(x) = \sin\dfrac{x-x_1}{2}\,\sin\dfrac{x-x_2}{2}\cdots\sin\dfrac{x-x_{2N}}{2}\,\bigl(1+\cos^m(x+\alpha)\bigr) = \sum_{\nu=-N-m}^{N+m} u_\nu e^{i\nu x}, \qquad u_{-\nu} = \bar{u}_\nu.$

(In case N=0 we write $u(x) = 1+\cos^m(x+\alpha)$.) This is a trigonometric polynomial
of the fixed degree N+m, the sign variations of which are the same as
those of $f^{(k)}(x)$. The coefficients $u_\nu = u_\nu(k)$ are bounded as $k \to \infty$; this can
easily be shown by multiplying out the expression

(2.1.3)    $u(x) = 2^{-2N}\prod_{\nu=1}^{2N}\bigl[e^{ix/2}e^{-i(x_\nu+\pi)/2} + e^{-ix/2}e^{i(x_\nu+\pi)/2}\bigr]\,\bigl\{1 + 2^{-m}\bigl(e^{i\alpha}e^{ix} + e^{-i\alpha}e^{-ix}\bigr)^m\bigr\}.$

Also we obtain for the highest coefficient of u(x)

(2.1.4)    $u_{N+m} = (-1)^N 2^{-2N-m}\exp\bigl\{-i\sum_{\nu=1}^{2N} x_\nu/2 + im\alpha\bigr\} = (-1)^N 2^{-2N-m}\exp\{ix_0 + im\alpha\}$

where the real quantity $x_0 = x_0(k)$ depends on k but it is independent of $\alpha$.
2.2. Let $c_{N+m} \ne 0$. The sign of

(2.2.1)    $\dfrac{1}{2\pi}\int_{-\pi}^{+\pi} f^{(k)}(x)u(x)\,dx = \sum_{\nu=-N-m}^{N+m} (i\nu)^k c_\nu u_{-\nu}$

is independent of $\alpha$, positive say. We determine $\alpha$ in such a way that the last
term

(2.2.2)    $(i(N+m))^k c_{N+m} u_{-N-m} = (i(N+m))^k c_{N+m} (-1)^N 2^{-2N-m}\exp\{-ix_0 - im\alpha\}$

becomes real and negative. Then

(2.2.3)    $2(N+m)^k\,|c_{N+m}u_{N+m}| \le \sum_{\nu=-N-m+1}^{N+m-1} |\nu|^k\,|c_\nu u_\nu| = \sum_{\nu=-N-m+1}^{N+m-1} |\nu|^k\,O(1)$

follows where the bounds O(1) are independent of k. But this involves a contradiction
for sufficiently large k.

(5) We could use as well $1+\cos m(x+\alpha)$ or $(1+\cos(x+\alpha))^m$ instead of $1+\cos^m(x+\alpha)$.


3. PROOF OF THEOREM A''


3.1. We start with some preliminary remarks. 

(a) The constant H must be positive since log ((1/2) log 2) <0. 

(b) Let $x = \sigma + it$, $\sigma$ and t real, and let T denote the unique value such
that (2.1.1) converges for $|t| < T$ and diverges for $|t| > T$ (or T is the largest
value such that f(x) is analytic in the strip $|t| < T$). We have

(3.1.1)    $\limsup_{\nu\to+\infty} |c_\nu|^{1/\nu} = e^{-T}.$

The modifications necessary for T=0 or $T=\infty$ are obvious.
Now another form of the assertion of Theorem A'' is

(3.1.2)    $T \ge H + \log((1/2)\log 2)$.


(c) Theorem A’ is obviously a consequence of Theorem A’’. 
3.2. We assume

(3.2.1)    $\limsup_{\nu\to+\infty} |c_\nu|\,e^{\nu\gamma} = +\infty$

and show that

(3.2.2)    $\gamma \ge H + \log((1/2)\log 2)$.

From (3.2.1) we conclude in a well known manner(6) the existence of a
sequence of integers {M} such that

(3.2.3)    $|c_M|\,e^{M\gamma} > |c_\nu|\,e^{|\nu|\gamma}, \qquad \pm\nu = 0, 1, 2, \dots, M-1.$


Now let $\varepsilon$ be an arbitrary but fixed positive number. We define a sequence of
integers k=k(M) by

(3.2.4)    $k = k(M) = [M(\log M + H - \varepsilon)]$.

Then an easy calculation shows that

(3.2.5)    $\dfrac{k}{\log k}\Bigl(1 + \dfrac{\log\log k - H}{\log k}\Bigr) = M\Bigl(1 + \dfrac{H-\varepsilon}{\log M}\Bigr)\Bigl(1 - \dfrac{\log\log M}{\log M} + \cdots\Bigr)\Bigl(1 + \dfrac{\log\log M - H}{\log M} + \cdots\Bigr) = M\Bigl(1 - \dfrac{\varepsilon}{\log M} + O\Bigl(\Bigl(\dfrac{\log\log M}{\log M}\Bigr)^2\Bigr)\Bigr) < M$

provided M is sufficiently large.

(6) Cf. G. Pólya and G. Szegő, Aufgaben und Lehrsätze aus der Analysis, vol. 1, 1925, p. 18;
p. 173, Problem 107. What is needed here is much less than the lemma used by Pólya and
Wiener, loc. cit., p. 252.
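
The inequality (3.2.5) can be checked numerically. The following is only a sketch under stated assumptions: the helper name `bound_3_2_5` and the sample values H = 1.5, ε = 0.1 are illustrative choices, not taken from the paper, and the formula evaluated is the right-hand side of (1.1.4) with k chosen as in (3.2.4).

```python
import math

def bound_3_2_5(M, H, eps):
    """Return (k, value): k = [M(log M + H - eps)] as in (3.2.4), and
    value = k/log k * (1 + (log log k - H)/log k), the bound in (1.1.4)."""
    k = math.floor(M * (math.log(M) + H - eps))
    lk = math.log(k)
    return k, k / lk * (1 + (math.log(lk) - H) / lk)

H, eps = 1.5, 0.1  # illustrative values only; H must satisfy (1.1.3)
assert H + math.log(0.5 * math.log(2)) > 0

for M in (10**4, 10**6, 10**8, 10**10):
    k, value = bound_3_2_5(M, H, eps)
    # The ratio value/M should stay below 1, in line with (3.2.5).
    print(M, k, value / M)
```

For these sample values the ratio stays below 1, which is what the proof needs in order to conclude N < M.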
3.3. Let us denote again by $x_1, x_2, \dots, x_{2N}$ the mod $2\pi$ distinct
sign variations of $f^{(k)}(x)$, $x_\nu = x_\nu(k)$; here k=k(M). We write, $\alpha$ real(7),

(3.3.1)    $u(x; M) = \sin\dfrac{x-x_1}{2}\,\sin\dfrac{x-x_2}{2}\cdots\sin\dfrac{x-x_{2N}}{2}\,\bigl(1+\cos^m(x+\alpha)\bigr) = \sum_{\nu=-M}^{+M} u_\nu e^{i\nu x}, \qquad N+m=M.$

(For N=0 we omit the sine factors.) Since N<M, m is positive. The trigonometric
polynomial u(x; M) is of degree M and it has the same sign variations
as $f^{(k)}(x)$. We prove

LEMMA 1. Let the trigonometric polynomial u(x; M) be defined by (3.3.1) and
let

(3.3.2)    $U(x; M) = (\cos(x/2))^{2N}(1+\cos^m x) = \sum_{\nu=-M}^{+M} U_\nu e^{i\nu x} = \sum_{\nu=-M}^{+M} U_\nu\cos\nu x;$

then the inequalities

(3.3.3)    $|u_\nu| \le U_\nu, \qquad \nu = 0, 1, 2, \dots, M,$

hold, with the sign "=" for $\nu = M$.

Indeed as (2.1.3) shows, the coefficients $u_\nu$ of u(x; M) are multilinear functions
of $e^{\pm i(x_\nu+\pi)/2}$ and $e^{\pm i\alpha}$ with non-negative coefficients. Thus we do not
decrease $|u_\nu|$ by replacing the quantities $e^{\pm i(x_\nu+\pi)/2}$ and $e^{\pm i\alpha}$ by 1, or by replacing
the constants $x_\nu$ by $-\pi$ and $\alpha$ by 0. This leads precisely to (3.3.2).
The assertion regarding $|u_M| = U_M$ is also clear. We have (see (2.1.4))

(3.3.4)    $u_M = (-1)^N 2^{-2N-m}\exp\{ix_0 + im\alpha\}, \qquad x_0 = -\sum_{\nu=1}^{2N} x_\nu/2, \qquad U_M = |u_M| = 2^{-2N-m}.$

(7) See Footnote 5.
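
Lemma 1 lends itself to a small numerical experiment. The sketch below is an illustration, not part of the paper: the function names, the random choice of the points x_ν, and the values N = 3, m = 4, α = 0.7 are all assumptions made for the example. It multiplies out the sine factors and the factor 1 + cos^m(x + α) as polynomials in e^{ix/2}, then compares the resulting coefficients with those of U(x; M), obtained by setting every x_ν to -π and α to 0.

```python
import cmath
import math
import random

def poly_mul(p, q):
    """Multiply two Laurent polynomials stored as {exponent: coefficient}."""
    r = {}
    for i, a in p.items():
        for j, b in q.items():
            r[i + j] = r.get(i + j, 0) + a * b
    return r

def u_coeffs(xs, m, alpha):
    """Fourier coefficients of prod_j sin((x - x_j)/2) * (1 + cos^m(x + alpha)).

    Working dicts are keyed by the exponent of e^{ix/2}; the result is keyed
    by the ordinary frequency nu. len(xs) is assumed to be even (= 2N)."""
    prod = {0: 1 + 0j}
    for a in xs:
        sine = {1: cmath.exp(-0.5j * a) / 2j, -1: -cmath.exp(0.5j * a) / 2j}
        prod = poly_mul(prod, sine)
    cos_pow = {0: 1 + 0j}
    cosine = {2: cmath.exp(1j * alpha) / 2, -2: cmath.exp(-1j * alpha) / 2}
    for _ in range(m):
        cos_pow = poly_mul(cos_pow, cosine)
    cos_pow[0] = cos_pow.get(0, 0) + 1  # turn cos^m into 1 + cos^m
    full = poly_mul(prod, cos_pow)
    return {j // 2: c for j, c in full.items()}

random.seed(0)
N, m = 3, 4          # illustrative sizes
M = N + m
xs = [random.uniform(0, 2 * math.pi) for _ in range(2 * N)]
u = u_coeffs(xs, m, alpha=0.7)
U = u_coeffs([-math.pi] * (2 * N), m, alpha=0.0)

for nu in range(M + 1):
    print(nu, abs(u.get(nu, 0)), U.get(nu, 0).real)
```

Each printed row shows |u_ν| next to U_ν; the two columns should agree in the last row (ν = M), where both equal 2^{-2N-m}.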




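3.4. Now

(3.4.1)    $\dfrac{1}{2\pi}\int_{-\pi}^{+\pi} f^{(k)}(x)u(x; M)\,dx = \sum_{\nu=-M}^{+M} (i\nu)^k c_\nu u_{-\nu}$

has a sign independent of $\alpha$. Choosing $\alpha$ in a proper way ($c_M \ne 0$) and using
(3.3.3) we obtain

(3.4.2)    $\sum_{\nu=-M+1}^{M-1} |\nu|^k\,|c_\nu|\,U_\nu \ge 2M^k\,|c_M|\,U_M.$

Taking (3.2.3) and the inequality $x \le e^{x-1}$ into account we find

(3.4.3)    $2U_M \le \sum_{\nu=-M+1}^{M-1} U_\nu\exp\{(|\nu|/M - 1)k + (M - |\nu|)\gamma\} = \sum_{\nu=-M+1}^{M-1} U_\nu Q^{|\nu|-M}, \qquad Q = \exp\{k/M - \gamma\}.$

From (3.2.4) we conclude that $Q = Q(M) \to \infty$ or more precisely $Q(M) \cong e^{H-\varepsilon-\gamma}M$
as $M \to \infty$. (The symbol $a(M) \cong b(M)$ means that $a(M)/b(M) \to 1$ as $M \to \infty$.) Introducing

(3.4.4)    $Q^{|\nu|} + Q^{-|\nu|} = 2T_{|\nu|}(\xi), \qquad \xi = (Q + Q^{-1})/2,$

where $T_{|\nu|}(\xi)$ is identical with the Tchebichef polynomial, and noting that
$Q^{|\nu|-M} \le T_{|\nu|}(\xi)/T_M(\xi)$ for $|\nu| \le M$, by virtue of (3.3.2)
we can write (3.4.3) in the following form:

(3.4.5)    $2U_M T_M(\xi) \le \sum_{\nu=-M+1}^{M-1} U_\nu T_{|\nu|}(\xi) = 2^{-N}(1+\xi)^N(1+\xi^m) - 2U_M T_M(\xi);$

hence

    $2U_M Q^M \le 2^{-N}(1+\xi)^N(1+\xi^m)$

or (cf. (3.3.4))

(3.4.6)    $2\,(Q/2\xi)^M \le (1+\xi^{-1})^N(1+\xi^{-m}).$

Now let $M \to \infty$. Then

(3.4.7)    $\lim\,(Q/2\xi)^M = \lim\,(1+Q^{-2})^{-M} = 1$

since $Q^{-2} = O(M^{-2})$. Further $\xi^{-1}M \to 2e^{-H+\varepsilon+\gamma}$ and N < M, so that

(3.4.8)    $2 \le \exp\{2e^{-H+\varepsilon+\gamma}\}$

follows. Since $\varepsilon$ is arbitrarily small, this involves (3.2.2).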


4. PROOF OF THEOREM B'

4.1. Let the order $\lambda$ of f(x) be greater than $\rho/(\rho-1)$. Then using the previous
notation (2.1.1),


log log 1/| ¢, 
(4.1.1) lim in a /| | = 
voto log v 


holds(*). Consequently 
(4.1.2) lim sup | c,| e” = ©, 
voto 


We obtain now instead of (3.2.3) the inequalities 
(4.1.3) | cur| >| c,| ell, +v=0,1,2,-++,M—1, 


holding for a certain sequence { M} of integers. 
The previous proof needs only unessential modifications. 
4.2. We write 


(4.2.1) k = k(M) = 


where g is a fixed constant satisfying the conditions 
(4.2.2) 1/p <q < 
An easy calculation shows that for large k 


log k 


ki-1/e 


log M log M 
log M log M 
-(1— pot! 
(1 
= u(i- +0( =))< M. 


Using the same notation and the same argument as in §3.4 we obtain 
instead of (3.4.3) 


(4.2.3) 


(4.2.4) 2Um U, exp {(|»| /M — 1)k + Me —| 


Since M’—|v|*<(M-—|»r|)pM*- we find as before 


M-1 
(4.2.5) 2Uus UAR +R”), R = 
ym—M+1 


(8) See Pélya and Wiener, loc. cit., p. 254. 




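On account of (4.2.1) we have $R = R(M) \cong M^{pq} \to \infty$ as $M \to \infty$.
Now let

(4.2.6)    $\xi = \xi(M) = (R + R^{-1})/2;$

then we obtain (cf. (3.4.6))

(4.2.7)    $2(1+R^{-2})^{-M} \le (1+\xi^{-1})^N(1+\xi^{-m}).$

But pq>1 so that $(1+R^{-2})^{-M} \to 1$. Moreover

(4.2.8)    $\xi^{-1}M \cong 2M^{1-pq} \to 0.$

This furnishes the contradictory inequality $2 \le 1$.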


5. PROOF OF THEOREM C


5.1. The proofs of Theorems C and D are based on arguments similar to
those followed in the previous part. Instead of trigonometric series, expansions
in Legendre series are used.

Let

(5.1.1)    $f(x) = \sum_{\nu=0}^{\infty} c_\nu P_\nu(x),$

$c_\nu$ real, be the Legendre expansion of f(x) where $P_\nu(x)$ is the Legendre polynomial
in the customary notation. By using the notation (1.2.1)

(5.1.2)    $\vartheta^l f(x) = \sum_{\nu=0}^{\infty} (-\lambda_\nu)^l c_\nu P_\nu(x), \qquad \lambda_\nu = \nu(\nu+1),$

follows. Let $N_k = N_{2l}$ denote the number of the sign variations of $\vartheta^l f(x)$ in
$-1 < x < +1$ and let $N_k = N_{2l} = N$ be fixed as $l \to \infty$ through a proper sequence
of integers. We show then that $c_{N+m} = 0$, $m > 0$.
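
The relation behind (5.1.2) is that each Legendre polynomial is reproduced by the operator up to the factor -ν(ν+1). A minimal sketch of an exact check follows, assuming that the operator (1.2.1) is the Legendre operator (1 - x²)D² - 2xD; the helper names `legendre`, `deriv` and `theta` are hypothetical, and polynomials are stored as coefficient lists with the constant term first.

```python
from fractions import Fraction

def legendre(n):
    """Coefficient list (lowest degree first) of P_n via Bonnet's recurrence."""
    p_prev, p = [Fraction(1)], [Fraction(0), Fraction(1)]
    if n == 0:
        return p_prev
    for k in range(1, n):
        # (k+1) P_{k+1} = (2k+1) x P_k - k P_{k-1}
        nxt = [Fraction(0)] + [Fraction(2 * k + 1, k + 1) * c for c in p]
        for i, c in enumerate(p_prev):
            nxt[i] -= Fraction(k, k + 1) * c
        p_prev, p = p, nxt
    return p

def deriv(p):
    """Derivative of a coefficient list; the zero polynomial is [0]."""
    return [i * c for i, c in enumerate(p)][1:] or [Fraction(0)]

def theta(p):
    """Apply (1 - x^2) D^2 - 2x D to a coefficient list."""
    d1, d2 = deriv(p), deriv(deriv(p))
    out = [Fraction(0)] * (len(p) + 2)
    for i, c in enumerate(d2):
        out[i] += c          # D^2 term
        out[i + 2] -= c      # -x^2 D^2 term
    for i, c in enumerate(d1):
        out[i + 1] -= 2 * c  # -2x D term
    return out[:len(p)]      # the degree never increases

for n in range(6):
    p = legendre(n)
    print(n, theta(p) == [-n * (n + 1) * c for c in p])
```

Because the arithmetic uses exact fractions, the comparison is an identity check rather than a floating-point approximation.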


Let $x_1, x_2, \dots, x_N$ be the sign variations of $\vartheta^l f(x)$ in $-1 < x < +1$. We
form(9)

(5.1.3)    $v(x) = (x-x_1)(x-x_2)\cdots(x-x_N)(1+\delta x^m) = \sum_{\nu=0}^{N+m} v_\nu P_\nu(x)$

where $\delta$ is either +1 or -1. This is a polynomial of degree N+m with the
same sign variations as $\vartheta^l f(x)$. The coefficients $v_\nu = v_\nu(l)$ are bounded as $l \to \infty$.
Furthermore $v_{N+m} = \delta h_{N+m}$ if $h_\nu$ denotes the highest coefficient in the Legendre
expansion of $x^\nu$.
Now

(5.1.4)    $\int_{-1}^{+1} \vartheta^l f(x)\,v(x)\,dx = \sum_{\nu=0}^{N+m} (\nu+1/2)^{-1}(-\lambda_\nu)^l c_\nu v_\nu.$

This expression has the same sign whether $\delta = +1$ or $\delta = -1$. We obtain by a
suitable choice of $\delta$

(5.1.5)    $\sum_{\nu=0}^{N+m-1} (\nu+1/2)^{-1}\lambda_\nu^l\,|c_\nu v_\nu| \ge (N+m+1/2)^{-1}\lambda_{N+m}^l\,|c_{N+m}|\,h_{N+m}.$

Division through by $\lambda_{N+m}^l$ leads to a contradiction as $l \to \infty$ unless $c_{N+m} = 0$.

(9) We could use $1+\delta P_m(x)$ instead of $1+\delta x^m$. See Footnote 5.


6. PROOF OF THEOREM D


6.1. For the proof of Theorem D we follow again the previous argument.
Under the assumption (3.2.1) we obtain a sequence {M} of integers such that
(3.2.3) holds. The definition of k=2l is in this case slightly different from
(3.2.4), namely

(6.1.1)    $k = 2l = 2[(M/2)(\log M + H - \varepsilon)]$.

Then k is even and $N = N_k < M$. Now we define m by N+m=M and
v(x)=v(x; M) by (5.1.3). We prove


LEMMA 2. Let the rational polynomial v(x)=v(x; M) be defined by (5.1.3)
and let

(6.1.2)    $V(x; M) = (x+1)^N(1+x^m) = \sum_{\nu=0}^{M} V_\nu P_\nu(x);$

then the inequalities

(6.1.3)    $|v_\nu| \le V_\nu, \qquad \nu = 0, 1, 2, \dots, M,$

hold with the sign "=" for $\nu = M$.


It is well known(10) that $P_\mu(x)P_\nu(x)$ expanded in terms of Legendre polynomials
has non-negative coefficients. Multiplying out

(6.1.4)    $v(x; M) = \prod_{\nu=1}^{N} (P_1(x) - x_\nu)\,\{1 + \delta(P_1(x))^m\}$

we see that the coefficients $v_\nu$ of v(x; M) are multilinear functions of $-x_\nu$ and $\delta$
with non-negative coefficients. Obviously we do not decrease $|v_\nu|$ by replacing
$-x_\nu$ and $\delta$ by 1 which leads precisely to (6.1.2).
6.2. Starting from (5.1.4) we obtain (5.1.5) and 


(6.2.1) (M + S (v + 


(10) See J. C. Adams, On the expression of the product of any two Legendre's coefficients by
means of a series of Legendre's coefficients, Proceedings of the Royal Society, vol. 27 (1878), pp.
63-71; Collected Scientific Papers, vol. 1, pp. 487-496. See E. T. Whittaker and G. N. Watson,
A Course of Modern Analysis, 4th edition, 1935, p. 331, Problem 11.




Now let 


(6.2.2) 1, 2, 3, - 


Then (v+1/2)g,d;"! is decreasing as v increases since 


v+1/2 (v + 1/2)(v — 1) 
y— 1/2 (v + 1) 


(6.2.3) 


Hence 
(6.2.4) + 1/2)g, > (M + 1/2)gudw » 
so that from (6.2.1) 


M-1 


(6.2.5) S >, 


But 


1/2 


is positive and increasing with y so that 


(6.2.7) S exp — Aw 


Hence 


< exp — M)An’}. 


v=0 


From (6.1.1) we conclude that S=S(M)=Se®-*-7 M as Mo. 

Writing 
(6.2.9) @, M-» @, 
we obtain by using a classical representation of Legendre polynomials("*) 
(6.2.10) PS) 2 gS", yv=0,1,2,---, 
so that from (6.2.8) and (6.1.2) 


(6.2.11) guhu VLPs) = + 99 (1 + — S™ hy Pu({). 
yan 

The representation mentioned furnishes also guhy =2-™ so that using (6.2.10) 

we again conclude 


(11) See, for instance, G. Szegő, Orthogonal Polynomials, American Mathematical Society
Colloquium Publications, vol. 23, 1939, p. 92, equation (4.9.4).




or 
Now let M— ©. Then (cf. section 3.4) 
= (1+ 1, 
so that (3.4.8) follows. Consequently (3.1.1), (3.1.2) hold. 

If the coefficients $c_\nu$ of the Legendre expansion (5.1.1) satisfy (3.1.1), then
f(x) must be analytic in an ellipse with foci at -1 and +1 the sum of the semi-axes
of which is $e^T$(12).

This establishes the proof of Theorem D. 

7. COUNTEREXAMPLES 


7.1. In this section we show that the conditions $N_k < k/\log k$ and
$N_k < (k/\rho)^{1/\rho}/2$ of Theorems A and B cannot be replaced by $N_k = O(k)$ and
$N_k = O(k^\alpha)$, respectively, where $\alpha > 1/\rho$. I owe the necessary counterexamples
to a suggestion of Professor Pólya.

7.2. The first assertion can be proved by considering the non-integral
periodic function

(7.2.1)    $f(x) = (1 - 2h\cos x + h^2)^{-1}, \qquad 0 < h < 1.$

We see by mathematical induction that

(7.2.2)    $f^{(k)}(x) = t_k(x)(1 - 2h\cos x + h^2)^{-k-1}$

where $t_k(x)$ is a trigonometric polynomial of degree k. Consequently in this
case $N_k \le 2k$.
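
The bound of section 7.2 can be observed numerically. The sketch below rests on assumptions that are not in the paper: it uses the classical cosine series of the function (7.2.1), namely (1/(1-h²))(1 + 2 Σ hⁿ cos nx), differentiates it termwise, truncates it at a hypothetical `terms` value, and counts sign changes on a finite grid, so it can miss but not invent sign changes in ordinary circumstances. The value h = 0.5 is an arbitrary sample.

```python
import math

def kth_derivative(x, k, h, terms=120):
    """k-th derivative (k >= 1) of 1/(1 - 2h cos x + h^2), computed from
    the termwise differentiated cosine series, truncated after `terms` terms."""
    s = sum(n**k * h**n * math.cos(n * x + k * math.pi / 2)
            for n in range(1, terms + 1))
    return 2.0 * s / (1.0 - h * h)

def sign_changes_on_circle(values):
    """Count sign changes of a sampled periodic function (wrapping around)."""
    signs = [v > 0 for v in values if v != 0.0]
    return sum(signs[i] != signs[i - 1] for i in range(len(signs)))

h, grid = 0.5, 2000
# Offset grid, so that the zeros at x = 0 and x = pi are never hit exactly.
xs = [(j + 0.5) * 2 * math.pi / grid for j in range(grid)]
counts = {k: sign_changes_on_circle([kth_derivative(x, k, h) for x in xs])
          for k in range(1, 6)}
for k, c in counts.items():
    print(k, c, "bound:", 2 * k)
```

The counts grow at most linearly in k, although the function is analytic only in a strip and is not an integral function.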
7.3. Let p>1. The integral function

(7.3.1)    $f(x) = \sum_{n=1}^{\infty} e^{-n^p}\cos nx$

is(13) of order p/(p-1) and as we shall prove $N_k = O(k^{1/p})$. This furnishes, indeed,
the desired counterexample by assuming $\alpha < 1$ and choosing p according
to the conditions $1/\rho < 1/p < \alpha$; then $\rho/(\rho-1) < p/(p-1)$.

Let k be even. We apply Jensen's theorem to the function

(7.3.2)    $f^{(k)}(x) = (-1)^{k/2}\sum_{n=1}^{\infty} n^k e^{-n^p}\cos nx$

in the circle $|x| \le 2\pi$. Since


(12) See Szegő, loc. cit., p. 238, Theorem 9.1.1.
(13) See Footnote 8.


(7.3.3) | f(x)| 
nel 


and

(7.3.4)    $|f^{(k)}(0)| = \sum_{n=1}^{\infty} n^k e^{-n^p}$

we find for the number N(r) of the roots of $f^{(k)}(x)$ in the circle $|x| \le r$


n=l n=l 


Obviously $N_k \le N(\pi)$.

In order to find a suitable bound for $N(\pi)$ let us consider the function
$\lambda(\nu) = \nu^k e^{-\nu^p}$ of the continuous variable $\nu$, $\nu \ge 1$. It is increasing for $\nu < \nu_0$ and
decreasing for $\nu > \nu_0$ where

(7.3.6)    $\nu_0 = \nu_0(k) = (k/p)^{1/p}.$

The maximum of $\lambda(\nu)$ is $\exp\{(k/p)\log(k/p) - k/p\}$.


The function assumes its maximum for v,=v)(k) 
= (2k/p)". 
Now let w be fixed, w>(2/p)"/?, log w—w?/2< —(log p+1)/p. Then for 


k— 


(7.3.7) T= ntewtten < |, 
nSwk!? 


Further \*(v) is decreasing for v >wk'/? >v¢ (k) so that 


n>wokl!? n=l 


= O(1) exp {k log w + (k/p) log k — w?k/2}. 

By use of the mean value theorem we find 

(7.3.9) log A(vo) log A([vo] + 1) + 

where C>0 is independent of k. But for large k 

k log w + (k/p) log k — w?k/2 < (k/p) log (k/p) — k/p — Chi!” 
< log A([vo] + 1), 


(7.3.8) 


(7.3.10) 


hence 
| II = 0(1)| #(0)|, 
(7.3.11) I+ II = f0) |. 


Consequently 




(7.3.12) = O(1) 


from which =O(k"/”) follows. 


7.4. In case of an odd k we have $f^{(k)}(0) = 0$. Then in Jensen's theorem
$f^{(k)}(0)$ has to be replaced by

(7.4.1)    $f^{(k+1)}(0) = (-1)^{(k+1)/2}\sum_{n=1}^{\infty} n^{k+1}e^{-n^p}.$

The previous argument holds good except that k has to be replaced by k+1.


STANFORD UNIVERSITY, 
STANFORD UNIVERSITY, CALIF.


ON THE OSCILLATION OF DIFFERENTIAL TRANSFORMS. II 
CHARACTERISTIC SERIES OF BOUNDARY VALUE PROBLEMS 


BY 
EINAR HILLE 


INTRODUCTION 


1. Formulation of problem. G. Pólya and N. Wiener [2](1) have recently
made important contributions to the S. Bernstein problem concerning the
relation between the frequency of oscillation of derivatives of high order and the
analytic character of the function. Assuming f(x) of period $2\pi$ and denoting the
number of sign changes of $f^{(k)}(x)$ in the period by $N_k$, they show that restrictions
in the rate of growth of $N_k$ when $k \to \infty$, imply that the high frequency
terms in the Fourier series of f(x) have "small" amplitudes. In particular, if
$N_k$ is bounded, $N_k \le N$ for all k, then the high frequency terms are entirely
missing and f(x) reduces to a trigonometric polynomial of degree at most N/2.
Conversely, if f(x) is a trigonometric polynomial of degree K, then $N_k = 2K$ for
all large k. Their results are less precise when $N_k$ is unbounded. While it is
likely that $N_k = O(k)$ is necessary and sufficient for analyticity of f(x), this
has not yet been proved, and the best they could do was to show that
$N_k = o(k^{1/2})$ implies that f(x) is an entire function.

For these and similar questions G. Szegő has devised a new method of
attack, presented in the first paper of this series [4]. This method showed itself
capable of giving more precise information when $N_k$ is unbounded. In
particular, Szegő could show that $N_k < k(\log k)^{-1}$ implies that f(x) is entire.

The present paper is also closely related to the paper of Pólya and Wiener,
but proceeds in a different direction. We aim to preserve the essence of the
methods developed by these writers and to apply them to a wider range of
problems. There are several features in the investigation of Pólya and Wiener
which suggest possible generalizations, in particular, the class of functions
considered and the operations applied to them.

Let T be an operator which takes functions f(x) of a certain class F into
functions of the same class. Any function u(x) of F such that $Tu(x) = \lambda u(x)$
will be called a characteristic function of T corresponding to the characteristic
value $\lambda$ and any formal series $\sum f_n u_n(x)$ will be called a characteristic series of T
if its terms are characteristic functions.

In this terminology we can describe the investigation of Pélya and Wiener 


Presented to the Society, November 22, 1941 under the title On the oscillation of differential
transforms. II; received by the editors January 2, 1942.
(1) Numbers in square brackets refer to the references at the end of the paper.




464 EINAR HILLE [November 


as follows(2). They are concerned with the differential operator $D^2$ and the
characteristic functions of this operator determined by the periodic boundary
value problem

(1.1)    $(D^2 + \mu)y = 0, \qquad y(x + 2\pi) = y(x).$

Any function $f(x) \in C^{(\infty)}(-\infty, \infty)$, satisfying the same condition of periodicity
$f(x + 2\pi) = f(x)$, can be represented by a characteristic series of the operator,

(1.2)    $f(x) = (a_0/2) + \sum_{n=1}^{\infty} (a_n\cos nx + b_n\sin nx),$

to which the operator $D^2$ can be applied termwise as often as we please. They
observe that for $\lambda > 0$, $D^2 - \lambda$ is an oscillation preserving transformation in the
sense that the transform $(D^2 - \lambda)f(x)$ has at least as many sign changes in
the period as f(x) has. This observation is used as follows.
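
The oscillation preserving property can be illustrated on a single example. The sketch below is only an illustration under stated assumptions: the sample function cos x + 0.2 cos 3x, the value λ = 1 and the grid-based counter are choices made here, not taken from the paper. The operator D² - λ is applied termwise, which sends cos nx to -(n² + λ) cos nx.

```python
import math

def sign_changes_on_circle(func, grid=3600):
    """Count sign changes of a 2*pi-periodic function on an offset grid."""
    xs = [(j + 0.5) * 2 * math.pi / grid for j in range(grid)]
    signs = [func(x) > 0 for x in xs]
    return sum(signs[i] != signs[i - 1] for i in range(grid))

def f(x):
    # Sample trigonometric polynomial (an arbitrary illustrative choice).
    return math.cos(x) + 0.2 * math.cos(3 * x)

def transform(x, lam=1.0):
    # (D^2 - lam) f, computed termwise: cos(nx) -> -(n^2 + lam) cos(nx).
    return -(1 + lam) * math.cos(x) - 0.2 * (9 + lam) * math.cos(3 * x)

before = sign_changes_on_circle(f)
after = sign_changes_on_circle(transform)
print(before, after)
```

In this example the transform has more sign changes in the period than the original function, in agreement with the inequality described above; one example of course proves nothing in general.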

Let m be a positive integer and multiply the mth term of the series (1.2) 
by the kth power of the factor 


(1.3) 


A function F(x, m, k; f) results which has at least as many sign changes in the
period as $f^{(2k)}(x)$ since

    $(D^2 - m^2)^{2k}F(x, m, k; f) = (2m)^{2k}f^{(2k)}(x).$


On the other hand, for large values of k the number of sign changes of 
F(x, m, k; f) can be shown to be at least 2m provided the mth term is present 
in the original expansion (1.2). This is the basis for all their conclusions. 

It is obvious from this formulation in what direction we are looking for
extensions. Instead of the operator $D^2$ we shall consider a rather general linear
differential operator L. In the present paper we restrict ourselves to second
order operators satisfying certain conditions, but first or higher order operators
would also be admissible. We define a set of characteristic functions of L
by a suitable boundary value problem for L in the basic interval (a, b) and
consider the corresponding class of characteristic series, F say, with the restriction
that L shall apply termwise to the series as often as we please. It
turns out that the operator $L - \lambda$, $\lambda > 0$, is always oscillation preserving in
(a, b) with respect to a suitable class of functions which includes F. Even the
"root consuming factor" (1.3) has an obvious analogue in terms of characteristic
values and the general procedure of Pólya and Wiener can be followed.


(2) Actually Pólya and Wiener work with the operator D and the corresponding characteristic
functions exp (nix). The "root consuming factor" in (1.3) is the square of their factor.
The emphasis and terminology have been changed in order to bring out the generalizations.


1942] ‘DIFFERENTIAL TRANSFORMS. II 465 


It should be observed, however, that the method is not constrained to the
consideration of characteristic series the terms of which are defined by boundary
value problems and consequently orthogonal in the basic interval. The
case of almost periodic functions was mentioned by Pólya and Wiener and a
non-orthogonal characteristic series figures also in §2.11 of the present paper(3).

2. Arrangement of material. Chapter I is devoted to a study of oscillation
preserving transformations defined by linear second order differential
operators. The basic definitions are found in §1.1 while §1.2 contains a number
of lemmas of the classical Sturmian type which are needed for the discussion.
In §1.3 the operators L are classified according to their behavior at the end
points of the basic interval and to each of the four types considered we introduce
function classes $B^{(k)}$ the elements of which satisfy, together with their
L-transforms of order less than k, the corresponding types of boundary conditions.
That $L - \lambda$, $\lambda > 0$, is oscillation preserving with respect to B is
proved in §§1.4 to 1.7. Various extensions to functions of L are discussed in
§1.8 and the corresponding boundary value problems are introduced in §1.9.
We call attention to the singular and semi-singular types which appear to be
new, though many of the most useful orthogonal systems considered in analysis
appear as solutions of such boundary value problems.

Chapter II brings the proof of the analogue of the Pólya-Wiener theorem
on finite characteristic series. Here we place the discussion on a rather elaborate
postulational basis to make up for our lack of knowledge of the existence
and properties of solutions of the singular and semi-singular boundary
value problems. We consider systems S consisting of an operator L, a set of
characteristic functions $\{u_n(x)\}$ with corresponding characteristic values
$\{\mu_n\}$, and a basic interval (a, b). We call the system admissible if it satisfies
conditions A; to Ag of §2.1. These are conditions which are well known
to hold in the case of classical boundary value problems but which, conceivably,
may fail in the case of singular ones. We also consider the class F of
admissible characteristic series $\sum f_n u_n(x)$ such that $\sum |\mu_n|^m |f_n| < \infty$ for all m.
The convergence theory of such series is discussed in §2.2. The system S is
called conservative if the set $\{u_n(x), \mu_n\}$ belongs to an appropriate boundary

(3) There are no general results available relating to oscillation problems for non-orthogonal
characteristic series. Existing evidence, meager as it is, seems to indicate that the situation is
similar to the orthogonal case. In other words, if the frequency of oscillation of $L^kf(x)$ is bounded
or has a finite limit inferior, then the frequency of oscillation of the components of f(x) is similarly
limited, the main difference being that we may now still have infinitely many components.
"Characteristic integrals" can also be studied from this point of view by a suitable modification
of the method. A first investigation of this type will be given by J. D. Tamarkin in a later paper
in this series. The author wishes to use this opportunity to express his gratitude to his collaborators
on the S. Bernstein problem, Professors G. Pólya, A. C. Schaeffer, G. Szegő, and J. D.
Tamarkin, with whom he has had many profitable discussions of various points of his work
during his stay at Stanford University.




value problem for L in (a, b) and it is shown that the results of Chapter I
apply to conservative systems. In particular, $L - \lambda$, $\lambda > 0$, and any real polynomial
in L with real positive roots are oscillation preserving in (a, b) with
respect to the class F. This is proved in §2.3 where we also discuss the relation
between F and the classes $B^{(k)}$ introduced in §1.3.

The main theorem is proved in §2.4. If S is conservative and $f(x) \in F$, then
the assumption that the inferior limit of the number of sign changes of $L^kf(x)$
in (a, b) is finite and equals N, implies that f(x) is a linear combination of a
finite number of characteristic functions $u_n(x)$, none of which can have more
than N (in an exceptional case possibly N+1) sign changes in (a, b). This
is the analogue of Theorem I of Pólya and Wiener. In §§2.5 and 2.6 we verify
that the classical boundary value problems lead to conservative systems. In §§2.7
to 2.11 we give similar verifications for the systems of Legendre, Jacobi,
Hermite, Weber-Hermite, and Laguerre, which correspond to singular boundary
problems, and that of Bessel which is semi-singular. We call attention,
in particular, to the characterization of ordinary polynomials by means of
sign change properties given in Theorems 12, 13, and 14. It is analogous to
the results of Pólya and Wiener for trigonometric polynomials quoted above.

Extensions to the case in which $N_k$ is unbounded are indicated briefly in
§3.1 of the Appendix. The author has extended Theorem III of Pólya and
Wiener under fairly general assumptions on the system, but the rather lengthy
and complicated analysis is omitted here and the results are stated merely for
the singular systems of §§2.7 to 2.10. It turns out that $N_k = o(k^{1/2})$ is again
sufficient in order that the corresponding characteristic series shall converge
in the finite complex plane and hence represent an entire function. The
method of Szegő gives a better result, when it applies, which is to the Legendre
and Jacobi cases.


CHAPTER I. OSCILLATION PRESERVING TRANSFORMATIONS 


1.1. Preliminary notions and formulas. All functions considered in Chapters
I and II are real functions of a real variable, defined in a finite or infinite
interval (a, b) and having certain properties of continuity in (a, b). Here
(a, b) stands for one of the four alternatives (a, b), (a, b], [a, b), and [a, b].
The symbols $C^{(k)}(a, b)$, with k=0, a positive integer, or $\infty$, refer to the usual
continuity classes. Finally we denote the class of all functions real and holomorphic
in (a, b) by A(a, b).

Let $g(x) \in C(a, b)$. We say that g(x) has N changes of sign in (a, b), if
(a, b) breaks up into exactly N+1 subintervals in each of which g(x) keeps a constant
sign, the signs being opposite in adjacent intervals. The subintervals are in
general not uniquely determined. The statement that the sign of g(x) in
$(x_1, x_2)$ is, for instance, positive is taken in the wide sense, that is, $g(x) \ge 0$
and actually is greater than 0 in some subinterval of $(x_1, x_2)$. If g(x) is periodic
of period b-a, this definition should be slightly modified. We map the interval
on the circumference of a circle, identifying the end points. Here N
intervals of alternating signs determine N sign changes. It is clear that N
must be even in the periodic case. If there is no finite N with these properties,
we say that g(x) has infinitely many sign changes in (a, b). The number of
sign changes of g(x) in (a, b), finite or infinite, is denoted by V[g(x)]. The
theorem of Rolle implies


LEMMA 1. If $g(x) \in C^{(1)}(x_1, x_2)$ and $g(x) \to 0$ when $x \to x_1$ and when $x \to x_2$ but
$g(x) \not\equiv 0$ in $(x_1, x_2)$, then g'(x) has at least one sign change in $(x_1, x_2)$.


Let L denote the differential operator defined by

(1.1.1)    $L[y] = p_0(x)y + p_1(x)Dy + p_2(x)D^2y, \qquad D = d/dx,$

where to start with the coefficients will be subjected to the following two
assumptions which will be held fast throughout the paper:

$A_1$. $p_m(x) \in A(a, b)$, m=0, 1, 2.

$A_2$. $p_0(x) \le 0$, $p_2(x) > 0$ for a<x<b.

For much of our work in §§1.2 to 1.7 it would be sufficient to assume
merely $p_m(x) \in C(a, b)$, but any consideration involving repeated application
of the operator requires additional restrictions of $p_m(x)$, so we may just
as well assume analyticity from the start(4).

The self-adjoint form of L is L* where

(1.1.2)    $L^*[y] = P(x)L[y] = D[P(x)p_2(x)Dy] + P(x)p_0(x)y,$

(1.1.3)    $P(x) = \dfrac{1}{p_2(x)}\exp\Bigl\{\int^x \dfrac{p_1(t)}{p_2(t)}\,dt\Bigr\}.$

Here P(x) > 0 for a<x<b. If $p_1(x) \equiv 0$, we take $P(x) = 1/p_2(x)$.
Any solution of the differential equation

(1.1.4)    $(L + \mu)y = 0,$

$\mu$ constant, will be referred to as a characteristic function of L corresponding
to the characteristic value $\mu$. The reader should note that the terminology differs
from that used in the Introduction according to which $-\mu$ rather than $\mu$
would be called the characteristic value. The present convention is preferable
when one works with second order linear differential equations.

If $f(x) \in C^{(2)}(a, b)$, then L[f] has a sense and $L[f] \in C^{(0)}(a, b)$. The differential
transform L[f] is the first L-transform of f(x). The higher L-transforms
are defined by recurrence:

(1.1.5)    $L^k[f] = L[L^{k-1}[f]], \qquad L^0[f] = f.$

If $f(x) \in C^{(2k)}(a, b)$, then $L^k[f]$ exists and belongs to $C^{(0)}(a, b)$. If convenient
or desirable we drop the brackets or exhibit the variable. Thus $L^kf$, $L^kf(x)$,
$L^k[f(x)]$, $L^k[f]$ all have the same meaning. The reader should observe that
the symbol $L^kf(x_0)$, $a < x_0 < b$, denotes the value of $L^k[f]$ for $x = x_0$ and not
the result of operating by $L^k$ on the constant $f(x_0)$.

(4) It should be observed in connection with $A_2$ that the theory goes through with only
minor changes if $p_0(x)$ has merely a finite upper bound in (a, b).


DEFINITION. Let F be a subclass of $C^{(2)}(a, b)$ and let $\lambda$ be fixed real. Then
$L - \lambda$ is said to be an oscillation preserving transformation in (a, b) with respect
to F if

(1.1.6)    $V[(L - \lambda)f(x)] \ge V[f(x)]$

for every $f(x) \in F$.


It should be observed that there are always functions $f(x) \not\equiv 0$, satisfying
(1.1.6). Thus if $\mu$ is fixed real and $y(x, \mu)$ is any solution of (1.1.4), then
$(L - \lambda)y(x, \mu) = -(\lambda + \mu)y(x, \mu)$ so that (1.1.6) is trivially satisfied for every
$\lambda \ne -\mu$. If $V[f(x)] = \infty$, (1.1.6) is understood to mean merely that the left-hand
side is also infinite.

The basic formula in the discussion of the operator $L - \lambda$ is the factorization
given by


for which see L. Schlesinger [3, vol. I, p. 52]. Here, $W_1$ is a solution of the
associated differential equation $(L - \lambda)y = 0$, and $W_2$ is the Wronskian of $W_1$
and a second linearly independent solution of the same equation. It is permitted
to assume that $W_1$ is real positive in (a, b). The crucial point in the
use of formula (1.1.7) lies in the choice of $W_1$ which we refer to as the auxiliary
solution.

1.2. General properties of the auxiliary solution. We proceed to a discussion
of the solutions of the associated differential equation(5)


(1.2.1) $(L - \lambda) y = 0, \quad \lambda > 0,$
in the interval $(a, b)$. Introducing

(1.2.2) $K(x) = P(x) p_2(x), \quad G(x, \lambda) = P(x)[\lambda - p_0(x)],$

we can rewrite the equation in the form

(1.2.3) $D[K(x) Dy] - G(x, \lambda) y = 0.$


Under the assumption $A_2$, $K(x)$ and $G(x, \lambda)$ for $\lambda > 0$ are positive in $(a, b)$.
Two integrated forms of the equation will be useful in the following. First 
we have obviously 


(5) The discussion follows the classical Sturmian pattern, but at least some of the required
results do not appear to be available in a convenient form in the literature. The proofs are held
down to a minimum.




1942] DIFFERENTIAL TRANSFORMS. II 


(1.2.4) $K(x_2) y'(x_2) - K(x_1) y'(x_1) = \int_{x_1}^{x_2} G(t, \lambda)\, y(t)\, dt.$


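Secondly, multiplying (1.2.3) by $y$ and integrating we get


(1.2.5) $\left[ K(x) y(x) y'(x) \right]_{x_1}^{x_2} = \int_{x_1}^{x_2} \left\{ K(t) [y'(t)]^2 + G(t, \lambda) [y(t)]^2 \right\} dt.$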


We conclude from (1.2.5) that if $y(x) \not\equiv 0$ is a solution of (1.2.1) in $(a, b)$,
then the product $y(x) y'(x)$ can vanish at most once in the interval. Hence
the real solutions of (1.2.1) are of the following four types in (a, b): (1) mono- 
tone solutions of constant sign, (2) solutions of constant sign having a maxi- 
mum, (3) solutions of constant sign having a minimum, and (4) monotone 
solutions having a zero. These types are mutually exclusive and exhaust the 
possibilities. The fourth type is of no interest to us in the following and will 
be omitted from consideration. 

The existence of unbounded solutions is vital in most of our discussion. 
We introduce the following notation: 


(1.2.6) $U(x, x_0) = \int_{x_0}^{x} \dfrac{dt}{K(t)}, \quad Q(x, x_0; \lambda) = \int_{x_0}^{x} G(t, \lambda)\, dt, \quad R(x, x_0; \lambda) = \int_{x_0}^{x} \dfrac{Q(u, x_0; \lambda)}{K(u)}\, du.$


LEMMA 2. Let $y(x)$ be the solution of (1.2.1) determined by the initial conditions
$y(x_0) = 1$, $y'(x_0) = s \ge 0$, $a < x_0 < b$. A necessary and sufficient condition that
$y(x) \to \infty$ when $x \to b$ is that $R(x, x_0; \lambda) \to \infty$ when $x \to b$. If the latter condition is
satisfied for a particular choice of $x_0$ and $\lambda$, then it holds for every $x_0$, $a < x_0 < b$,
and every $\lambda > 0$. Moreover, if the condition holds, every solution of (1.2.1) such
that $y(x) y'(x)$ is ultimately positive for approach to $b$ becomes infinite when
$x \to b$. Similarly, if $R(x, x_0; \lambda) \to \infty$ when $x \to a$, then every solution with
$y(x) y'(x)$ ultimately negative for approach to $a$ becomes infinite when $x \to a$.


It is clear from the structure of $R(x, x_0; \lambda)$ that the condition is independent
of $x_0$ and $\lambda$. We shall prove the lemma for fixed $x_0$ and $\lambda$ and consider only
the case $x \to b$. The same method applies at the other end point. The lemma is
an immediate consequence of


LEMMA 3. Under the assumptions of Lemma 2(6)
(1.2.7) $S(x, x_0; \lambda) < y(x) < \exp\{S(x, x_0; \lambda)\}$


(6) The inequality (1.2.7) does not give very precise information regarding the rate of
growth of $y(x)$, but in a certain sense it is the best of its kind. The ratio $y(x)/S(x, x_0; \lambda)$ is
bounded in the case of the Legendre operator $L = (1 - x^2) D^2 - 2xD$, $a = -1$, $b = 1$, for approach
to the singular end points, while $y(x) \exp[-S(x, x_0; \lambda)]$ is bounded away from zero in the case
of the Hermite operator $L = D^2 - 2xD$, $a = -\infty$, $b = \infty$. See §§2.7 and 2.9 below.




for $x_0 \le x < b$, where
(1.2.8) $S(x, x_0; \lambda) = R(x, x_0; \lambda) + K(x_0) y'(x_0) U(x, x_0).$


Putting $x_1 = x_0$ and $x_2 = u$ in (1.2.4) and noting that $y(x)$ is increasing and
greater than 1 in the interval $(x_0, b)$, we get


$K(u) y'(u) > K(x_0) y'(x_0) + Q(u, x_0; \lambda).$


Dividing by K(u) and integrating from x» to x, we obtain the first half of 
(1.2.7). But we have obviously also 


$K(u) y'(u) < K(x_0) y'(x_0) + Q(u, x_0; \lambda)\, y(u).$


Dividing by $K(u) y(u)$, dropping $y(u) > 1$ in the first denominator on the
right, and then integrating from $x_0$ to $x$, we get the second half of the inequality.

This shows that $y(x)$ becomes infinite when $x \to b$ if and only if $S(x, x_0; \lambda)$
has the same property. But both terms on the right in (1.2.8) are positive
and a simple calculation shows that $U(x, x_0) \to \infty$ when $x \to b$ implies
$R(x, x_0; \lambda) \to \infty$, but not vice versa. This completes the proof of the lemmas.
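The two-sided estimate of Lemma 3 can be observed numerically. The sketch below is an illustration only. It integrates the equation $D[K\,Dy] = G\,y$ by a classical Runge-Kutta step, accumulates the three integrals $U$, $Q$, $R$ by the trapezoidal rule, and forms $S = R + K(x_0) y'(x_0) U$. The sample data $K = 1$, $G = 1$ (that is, $L = D^2$, $\lambda = 1$), $x_0 = 0$, $s = 1/2$, and the function name `solve_and_bound`, are assumptions of the sketch.

```python
import math

def solve_and_bound(K, G, x0, s, x_end, n=2000):
    """Integrate D[K y'] = G y with y(x0) = 1, y'(x0) = s by RK4 and
    accumulate U, Q, R by the trapezoidal rule.
    Returns a list of triples (x, y(x), S(x)) with S = R + K(x0)*s*U."""
    h = (x_end - x0) / n
    x, y, v = x0, 1.0, K(x0) * s          # v stands for K(x) y'(x)
    U = Q = R = 0.0
    rows = [(x, y, 0.0)]

    def rhs(x, y, v):
        # First-order system: y' = v / K,  v' = G y.
        return v / K(x), G(x) * y

    for _ in range(n):
        k1 = rhs(x, y, v)
        k2 = rhs(x + h / 2, y + h / 2 * k1[0], v + h / 2 * k1[1])
        k3 = rhs(x + h / 2, y + h / 2 * k2[0], v + h / 2 * k2[1])
        k4 = rhs(x + h, y + h * k3[0], v + h * k3[1])
        y += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        v += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        # Trapezoidal updates of Q = int G, R = int Q/K, U = int 1/K.
        Q_new = Q + h / 2 * (G(x) + G(x + h))
        R += h / 2 * (Q / K(x) + Q_new / K(x + h))
        U += h / 2 * (1 / K(x) + 1 / K(x + h))
        Q = Q_new
        x += h
        rows.append((x, y, R + K(x0) * s * U))
    return rows

# Sample data (assumed): L = D^2, lambda = 1, so K = 1 and G = 1.
rows = solve_and_bound(K=lambda x: 1.0, G=lambda x: 1.0, x0=0.0, s=0.5, x_end=3.0)
```

For these data the exact solution is $\cosh x + \tfrac{1}{2} \sinh x$ and $S = x^2/2 + x/2$, so the computed triples can be compared with closed forms.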

These inequalities have a relation to the transformation theory of the
differential equation which is of some interest for the following. If we introduce
a new independent variable in (1.2.1) by putting $u = U(x, x_0)$ and define
$y(x) = Y(u)$, then the transformed differential equation is simply


(1.2.9) $\dfrac{d^2 Y}{du^2} = \dfrac{d^2 R}{du^2}\, Y,$


where under our assumptions $d^2 R/du^2 > 0$ in the interval $(A, B)$ which is the
image of $(a, b)$ under the transformation. If, for instance, $B = \infty$, then it is
perfectly trivial that every solution of (1.2.9), which is not positive monotone
decreasing in $(A, B)$, becomes infinite with $u$. This transformation will be
useful in the proof of the next lemma which is a comparison theorem of the
classical Sturmian type.


LEMMA 4. Let $y(x; x_0, s, \lambda)$ be the solution of $(L - \lambda) y = 0$, $y(x_0) = 1$,
$y'(x_0) = s \ge 0$. (1) For fixed $x$, $x_0$, and $s$, $x_0 < x$, $y(x; x_0, s, \lambda)$ is an increasing
function of $\lambda$. (2) For fixed $\lambda$, the ratio of $y(x; x_1, 0, \lambda)$ to $y(x; x_2, 0, \lambda)$,
$a < x_1 < x_2 < b$, lies between finite positive bounds depending upon $x_1$ and $x_2$ but
not upon $x$, $a < x < b$. (3) For fixed $x_0$ and $\lambda$, the ratio of $y(x; x_0, s_1, \lambda)$ to
$y(x; x_0, s_2, \lambda)$, $0 \le s_1 < s_2$, lies between finite positive bounds depending upon $s_2$
but not upon $x$, $x_0 \le x < b$.


The first statement follows directly from the formula


$K(x) [y_1'(x) y_2(x) - y_1(x) y_2'(x)] = (\lambda_1 - \lambda_2) \int_{x_0}^{x} P(t) y_1(t) y_2(t)\, dt$




with obvious notation. The second assertion lies slightly deeper, but follows
from the expression for the Wronskian of two solutions $y_1(x)$ and $y_2(x)$ of the
equation. Taking $y_1(x) = y(x; x_1, 0, \lambda)$, $y_2(x) = y(x; x_2, 0, \lambda)$, we get


$K(t) W_2(y_1(t), y_2(t)) = K(x_1) y_2'(x_1) = -C < 0.$


Dividing through by $[y_2(t)]^2 K(t)$ and integrating from $x_2$ to $x$ where $x_2 < x$, we
obtain


$\dfrac{y_1(x)}{y_2(x)} = y_1(x_2) + C \int_{x_2}^{x} \dfrac{dt}{K(t) [y_2(t)]^2},$


so the statement is proved for such values of $x$ if we can show the convergence
of the integral when $x \to b$. This is trivial if the integral obtained by suppressing
the factor $[y_2(t)]^2$ in the denominator is convergent. Hence we can
assume that the function $U(x, x_0)$ of formula (1.2.6) tends to infinity when
$x \to b$. Putting $u = U(x, x_2)$ and transforming the differential equation upon the
form (1.2.9) we get


$\int_{x_2}^{x} \dfrac{dt}{K(t) [y_2(t)]^2} = \int_{0}^{u} \dfrac{dv}{[Y_2(v)]^2}$

with obvious notation. But $Y_2(v)$ is positive, concave upwards for $v > 0$, and
tends to infinity with $v$. Hence we can find a linear function $\alpha v + \beta$ with $\alpha > 0$
such that $Y_2(v) > \alpha v + \beta$ for $v > v_0$. This proves the convergence of the integral
and gives a finite upper bound for the ratio in the interval $(x_2, b)$ where the
lower bound is unity. In the interval $(a, x_1)$ we simply interchange $y_1(x)$ and
$y_2(x)$ and apply the same method. The interval $(x_1, x_2)$ is trivial. This completes
the proof of (2). The same method can be used in proving (3).

1.3. Boundary conditions. We shall make no attempt to determine the
maximal class with respect to which the operator $L - \lambda$ is oscillation preserving
in $(a, b)$. It is likely to be a complicated and none too interesting problem.
We shall instead specialize L in various ways and determine certain associated 
classes of functions by means of appropriate boundary conditions. We shall 
consider four alternatives which by no means exhaust the field but which at 
least cover a large number of cases of well established interest. 

$T_1$. Sturm-Liouville type. $p_n(x) \in A[a, b]$, $n = 0, 1, 2$; $p_2(a) \ne 0$, $p_2(b) \ne 0$.

$T_2$. Periodic type. Assumptions as under $T_1$, but in addition, $K(a) = K(b)$,
that is, $\int_a^b \{p_1(t)/p_2(t)\}\, dt = 0$.

$T_3$. Singular type. $R(x, x_0; \lambda) \to \infty$ when $x \to a$ and when $x \to b$ for some $x_0$,
$a < x_0 < b$, and $\lambda > 0$.

$T_4$. Semi-singular type. $p_n(x) \in A(a, b]$, $n = 0, 1, 2$; $p_2(b) \ne 0$; and
$R(x, b; \lambda) \to \infty$ when $x \to a$, $\lambda > 0$.

In $T_3$ and $T_4$ the functions $R(x, x_0; \lambda)$ and $R(x, b; \lambda)$ are defined by (1.2.6).
Lemma 2 shows that the value of $x_0$ is immaterial and that the condition
holds for all $\lambda > 0$ if it holds for a single one. In $T_4$ the roles of $a$ and $b$ can of
course be interchanged.




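With each operator $L$ of type $T_\nu$ we associate a set of classes $B_\nu^{(k)}\{L; (a, b)\}$
of functions $f(x)$ satisfying appropriate boundary conditions. Here $k$ is a positive
integer or infinity.

$B_1$. $B_1^{(k)}\{L; [a, b]\} \subset C^{(2k)}[a, b]$; $L^n f(a) = 0$, $L^n f(b) = 0$, $n = 0, 1, \cdots, k-1$.

$B_2$. $B_2^{(k)}\{L; [a, b]\} \subset C^{(2k)}[a, b]$; $L^n f(a) = L^n f(b)$, $n = 0, 1, \cdots, k-1, k$;
$DL^n f(a) = DL^n f(b)$, $n = 0, 1, \cdots, k-1$.

$B_3$. $B_3^{(k)}\{L; (a, b)\} \subset C^{(2k)}(a, b)$; $[L^n f(x)]/y(x; x_0, \lambda) \to 0$ when $x \to a$ and
when $x \to b$ for $n = 0, 1, \cdots, k-1$, for some $x_0$, $a < x_0 < b$, and arbitrarily small
positive $\lambda$.

$B_4$. $B_4^{(k)}\{L; (a, b]\} = B_4^{(k)}\{L; C_1, C_2; (a, b]\} \subset C^{(2k)}(a, b]$; $[L^n f(x)]/y(x; b, \lambda) \to 0$
when $x \to a$ for arbitrarily small positive $\lambda$; $C_1 L^n f(b) + C_2 DL^n f(b) = 0$, $C_1$, $C_2$
fixed greater than or equal to 0, both conditions holding for $n = 0, 1, \cdots, k-1$.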

Here $L^n f(a)$ and $DL^n f(a)$ are the values of $L^n f(x)$ and $DL^n f(x)$ at $x = a$.
In $B_3$, $y(x; x_0, \lambda) = y(x; x_0, 0, \lambda)$ in the notation of Lemma 4, similarly in $B_4$
where $x_0 = b$. By virtue of Lemma 4 we should expect that the value of $x_0$ is
immaterial and that small positive values of $\lambda$ are the decisive ones. It is
perfectly obvious that we could consider other classes of functions in connection
with these operators. In particular, more general boundary conditions
could be allowed at one end point in $B_1$. If $a$ and $b$ are interchanged in $B_4$,
the sign of $C_2$ should also be changed. We merely mention these possibilities.
Our main object in Chapter I will be to study the four listed types in some de- 
tail and to prove Theorem 1 and its various extensions. 


THEOREM 1. If the operator $L$ is of type $T_\nu$ and $\lambda > 0$, then for every
$f(x) \in B_\nu^{(1)}\{L; (a, b)\}$, $\nu = 1, 2, 3, 4$, we have(7) $V[(L - \lambda) f(x)] \ge V[f(x)]$, that
is, $L - \lambda$ is oscillation preserving in $(a, b)$ with respect to the corresponding class
$B_\nu^{(1)}\{L; (a, b)\}$.


1.4. Discussion of the Sturm-Liouville case. This case is readily recognized
and the proof of Theorem 1 is quite simple. We choose $W_1 = y(x, \lambda)$ in
(1.1.7) as the solution of the initial value problem $(L - \lambda) y = 0$, $y(b, \lambda) = 1$,
$y'(b, \lambda) = 0$. Formula (1.2.5) shows that $y(x, \lambda) > 1$ in $(a, b)$.

The theorem is trivial if $V[f(x)] = \infty$. Suppose then that $V[f(x)] = N < \infty$.
We can then find $N + 2$ points $x_j$ where $a = x_1 < x_2 < \cdots < x_{N+1} < x_{N+2} = b$,
such that $f(x_j) = 0$ and $f(x)$ is not identically zero in any one of the intervals
$(x_j, x_{j+1})$. Since $y(x, \lambda) \ge 1$, Lemma 1 shows that $D[f(x)/y(x, \lambda)]$ has at least
one sign change in each of the $N + 1$ intervals $(x_j, x_{j+1})$. Multiplication by the
positive bounded factor $[y(x, \lambda)]^2/W_2$ does not change this situation and the
derivative of the result by Lemma 1 has at least $N$ sign changes in $(a, b)$.
Hence $V[(L - \lambda) f(x)] \ge N$ and the theorem is proved.

If the boundary conditions in $B_1$ for $k = 1$ be modified so that $f(a) = 0$ is
replaced by the condition $C_1 f(a) - C_2 f'(a) = 0$, $C_1 \ge 0$, $C_2 > 0$, while the condition
$f(b) = 0$ is left intact, the proof can still be carried through, but the choice
of $y(x, \lambda)$ has to be modified accordingly. To each $f(x)$ of the class we determine
a corresponding $y(x, \lambda)$ by the condition that it should have the same
logarithmic derivative as $f(x)$ at $x = a$. Taking $y(a, \lambda) = 1$ as is permissible,
we still have $y(x, \lambda) > 1$ in $(a, b)$. Then $D[f(x)/y(x, \lambda)]$ will be zero at $x = a$
instead of in the interior of $(a, x_1)$. It consequently still has $N + 1$ zeros in
$(a, b)$ and does not vanish identically between any consecutive pair of zeros.
Thus $(L - \lambda) f(x)$ has $N$ sign changes at least, and the theorem is proved under
the more general assumptions. The restriction imposed on the sign of the
logarithmic derivative of $f(x)$ at $x = a$ is dictated solely by our concern that
the corresponding $y(x, \lambda)$ shall be positive in $[a, b]$. If this condition is known
to be satisfied, the restriction can be dropped. It is clear that modifying the
boundary conditions at both end points meets with additional difficulties and
this problem will not be considered here. It should be observed, however, that
the case $f'(a) = 0$, $f'(b) = 0$, can be handled without difficulty.


(7) $V[g]$ is to be computed according to the definition for periodic functions when $\nu = 2$
but according to the main definition in the other cases. See §1.1, second paragraph.

1.5. Discussion of the periodic case. The name periodic case is to some 
extent a misnomer, but it is a customary designation for the corresponding 
type of boundary conditions and the case has close relations to periodicity 
in the usual sense. Moreover, it includes as a special instance the case in which
$K(x)$ and $G(x, \lambda)$ are periodic with period $(b - a)$.

If $f(x) \in B_2^{(1)}\{L; [a, b]\}$, then $f(x) \in C^{(2)}[a, b]$, $f(a) = f(b)$, $f'(a) = f'(b)$, and
$Lf(a) = Lf(b)$. We can then find a function $f^*(x) \in C^{(1)}(-\infty, \infty)$ such that
$f^*(x + b - a) = f^*(x)$ and $f^*(x) = f(x)$ in $[a, b]$. The second derivative of $f^*(x)$ is
continuous everywhere with the possible exception of $x \equiv a \pmod{(b - a)}$
where, however, right- and left-hand derivatives exist. Similarly $Lf(x)$ can
be extended periodically as a continuous function and the extension agrees
with $Lf^*(x)$.

The definition of V[g(x)] given in §1.1 varied slightly according as g(x) 
was defined only in [a, b] or could be extended periodically as a continuous 
function with period $(b - a)$ outside of this interval. In the latter case the
definition was such that the number of sign changes in the period would be 
independent of the choice of the end points. Actually the two definitions are 
always in agreement except in the case in which g(a) =g(b) =0 and g(x) has 
an odd number, say 2K —1, sign changes in the interior of the interval. In 
this case one definition would give V[g(x)]=2K—1 and the other 2K, the 
zero at x =a being counted as an additional sign change in the definition for 
periodic functions. 
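The relation between the two ways of counting sign changes can be made concrete on sampled data. The sketch below is an illustration only. It assumes that $g$ is sampled at equally spaced points of one period with the right end point omitted, and that samples smaller in modulus than a tolerance are treated as zeros; the names `V_interval` and `V_periodic` are hypothetical.

```python
import math

def signs_of(values, tol=1e-12):
    """Signs of the samples, with (numerically) vanishing samples dropped."""
    return [1 if v > 0 else -1 for v in values if abs(v) > tol]

def V_interval(values):
    """Sign changes counted along the interval (non-periodic definition)."""
    s = signs_of(values)
    return sum(1 for p, q in zip(s, s[1:]) if p != q)

def V_periodic(values):
    """Sign changes counted around the period: the last sample is also
    compared with the first, so a zero at x = a separating opposite signs
    is counted as a sign change."""
    s = signs_of(values)
    if not s:
        return 0
    return sum(1 for p, q in zip(s, s[1:] + s[:1]) if p != q)

def sample_period(func, a, b, n=1000):
    """Samples at a, a + (b-a)/n, ..., omitting x = b (where g(b) = g(a))."""
    return [func(a + (b - a) * i / n) for i in range(n)]

two_pi = 2 * math.pi
g1 = sample_period(math.sin, 0.0, two_pi)                    # one interior sign change
g2 = sample_period(lambda x: math.sin(2 * x), 0.0, two_pi)   # three interior sign changes
g3 = sample_period(math.cos, 0.0, two_pi)                    # g(a) = g(b) != 0
```

For `g1` and `g2` the function vanishes at both end points and has an odd number of interior sign changes, so the periodic count exceeds the other by one; for `g3` the two counts agree.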

We now agree that if $\nu = 2$ the definition for periodic functions shall be
used in interpreting the $V$-symbols in Theorem 1. In other words, the inequality
to be proved is actually


(1.5.1) $V[(L - \lambda) f^*(x)] \ge V[f^*(x)].$


In the subsequent proof V[g] refers to the non-periodic and V[g*] to the 




periodic definition. The reader should note that $V[g] \le V[g^*] \le V[g] + 1$ and
$V[g^*]$ is always an even number.

For the proof we have to distinguish several subcases. Suppose first that
$f(a) = f(b) = 0$ but $V[f(x)] = V[f^*(x)] = 2K$. The proof given in §1.4 applies
without any change and gives $V[(L - \lambda) f(x)] \ge 2K$ which in turn implies
(1.5.1).

Suppose next that $f(a) = f(b) = 0$ and $V[f(x)] = 2K - 1$, $V[f^*(x)] = 2K$. We
choose the same auxiliary solution $y(x, \lambda)$ as in the preceding case. By Lemma
1, $D[f(x)/y(x, \lambda)]$ has at least $2K$ sign changes in $(a, b)$. It follows that
$V[(L - \lambda) f(x)] \ge 2K - 1$. Hence $V[(L - \lambda) f^*(x)] \ge 2K$ and (1.5.1) follows.

Suppose finally that $f(a) = f(b) > 0$ and $V[f(x)] = V[f^*(x)] = 2K$. If $f'(a) \ge 0$
we determine $y(x, \lambda)$ by the initial conditions $y(a, \lambda) = 1$, $y'(a, \lambda) = f'(a)/f(a)$.
If $f'(a) = f'(b) < 0$, we take instead $y(b, \lambda) = 1$, $y'(b, \lambda) = f'(b)/f(b)$. In either
case $y(x, \lambda) \ge 1$ in $[a, b]$. Then


$\dfrac{d}{dx} \left[ \dfrac{f(x)}{y(x, \lambda)} \right]$

has at least $2K - 1$ sign changes in $(a, b)$ and, in addition, vanishes at $x = a$
or $x = b$ depending upon the sign of $f'(a)$. It follows that $V[(L - \lambda) f(x)] \ge 2K - 1$
and $V[(L - \lambda) f^*(x)] \ge 2K$. This completes the proof of Theorem 1
in the periodic case.

The proof is modelled upon that given by Pólya and Wiener for the case
$L = D^2$.

1.6. Discussion of the singular case. This case is characterized by the
presence of singular points of the differential equation at $x = a$ and $x = b$, sufficiently
severe to cause the critical function $R(x, x_0; \lambda)$ to become infinite for
$x \to a$ and for $x \to b$. The class $B_3^{(1)}\{L; (a, b)\}$ consists of all functions $f(x) \in C^{(2)}(a, b)$ such
that $f(x)/y(x; x_0, \lambda) \to 0$ when $x \to a$ and when $x \to b$ for arbitrarily small positive
values of $\lambda$. Here $y(x; x_0, \lambda)$ is determined by $(L - \lambda) y = 0$, $y(x_0) = 1$,
$y'(x_0) = 0$, $a < x_0 < b$. Let us first assess the influence of $x_0$ and $\lambda$ upon the determination
of the class $B_3^{(1)}\{L; (a, b)\}$.

Suppose that $x_0$ and $\lambda$ are fixed and suppose $f(x) \in C^{(2)}(a, b)$, $f(x)/y(x; x_0, \lambda) \to 0$,
$x \to a, b$. Denote the class of all such functions for the moment by
$F^{(1)}(\lambda, x_0)$. By Lemma 4 the ratio of $y(x; x_1, \lambda)$ to $y(x; x_2, \lambda)$ is bounded away
from zero and infinity in $(a, b)$. It follows that $f(x) \in F^{(1)}(\lambda, x_1)$ implies
$f(x) \in F^{(1)}(\lambda, x_2)$ and vice versa so that $F^{(1)}(\lambda, x_0)$ is independent of $x_0$ and can
be written simply $F^{(1)}(\lambda)$. Lemma 4 also asserts that $y(x; x_0, \lambda)$ is an increasing
function of $\lambda$ in $x_0 \le x < b$. But in our case $s = 0$ so that the argument given in
Lemma 4, part (1), applies also to the interval $(a, x_0)$. Hence $f(x) \in F^{(1)}(\lambda_1)$
implies $f(x) \in F^{(1)}(\lambda_2)$ for $\lambda_1 < \lambda_2$. In other words $F^{(1)}(\lambda_1) \subset F^{(1)}(\lambda_2)$ when $\lambda_1 < \lambda_2$.
Thus the cross section of all classes $F^{(1)}(\lambda)$ with $\lambda > 0$ exists and equals
$\lim_{\lambda \to 0} F^{(1)}(\lambda) = F^{(1)}(+0)$. We can define in the same manner classes $F^{(k)}(\lambda)$
consisting of all functions $f(x)$ of $C^{(2k)}(a, b)$ for which $L^n f(x)/y(x; x_0, \lambda) \to 0$,




$x \to a, b$, $n = 0, 1, \cdots, k-1$, as well as their cross section for $\lambda > 0$,
$F^{(k)}(+0) = \lim_{\lambda \to 0} F^{(k)}(\lambda)$.

Since $y(x; x_0, 0)$ is well defined, so is the class $F^{(k)}(0)$ and it is clear that
$F^{(k)}(0) \subset F^{(k)}(+0)$. Ordinarily these sets do not coincide because the sets
$F^{(k)}(\lambda)$ are as a rule not continuous in $\lambda$. Even if they are continuous for $\lambda > 0$,
they may very well lose this property for $\lambda = 0$. Simple examples can be given
for both possibilities.

If $p_0(x) \equiv 0$, $y(x; x_0, 0) \equiv 1$ and there is no auxiliary solution (of constant
sign) which becomes infinite at both end points for $\lambda = 0$. In this case $F^{(k)}(0)$
reduces simply to the subset of $C^{(2k)}(a, b)$ the elements of which satisfy the
boundary conditions $L^n f(x) \to 0$, $x \to a, b$, $n = 0, 1, \cdots, k-1$. It is obvious that
this set is a subset of every class $F^{(k)}(\lambda)$, $\lambda > 0$. If $p_0(x) \not\equiv 0$ and $R(x, x_0; 0) \to \infty$,
$x \to a, b$, then $F^{(k)}(0)$ certainly contains elements which do not vanish on the
boundary together with their $L$-transforms of order at most $k - 1$.

These results allow us to formulate 


THEOREM 2. $B_3^{(k)}\{L; (a, b)\} = F^{(k)}(+0) \supset F^{(k)}(0)$.


The proof of Theorem 1 in the singular case can be given in a few lines.
We choose $W_1 = y(x; x_0, \lambda)$ and proceed as in the Sturm-Liouville case, the
only difference being that the points $x_j$ now figure as zeros of the continuous
function $f(x)/y(x; x_0, \lambda)$ rather than of $f(x)$ which of course supplies the sign
changes in $(a, b)$. Lemma 1 applies as before and gives $V[(L - \lambda) f(x)] \ge N$(8).


1.7. Discussion of the semi-singular case. The discussion follows the same
general pattern as in the singular case. The class $B_4^{(1)}\{L; C_1, C_2; (a, b]\}$ is
defined as that subclass of $C^{(2)}(a, b]$ the elements of which satisfy at the
singular end point $x = a$ the condition $f(x)/y(x; b, \lambda) \to 0$ for arbitrarily small
$\lambda > 0$, while at the regular end point $x = b$ we have $C_1 f(b) + C_2 f'(b) = 0$ where
$C_1 \ge 0$, $C_2 \ge 0$, $C_1 + C_2 > 0$ are fixed. The auxiliary solution $y(x; b, \lambda)$ satisfies
the initial conditions $y(b) = 1$, $y'(b) = 0$.

Let us denote by $G^{(1)}(C_1, C_2; \lambda)$ the class of functions in $C^{(2)}(a, b]$ which
satisfy these boundary conditions for a fixed $\lambda \ge 0$. Lemma 4 ensures that the
ratio of $y(x; b, 0, \lambda) = y(x; b, \lambda)$ to $y(x; b, -s, \lambda)$ for a fixed positive $s$ is
bounded away from zero and infinity in $(a, b]$(9). Hence we have also
$f(x)/y(x; b, -s, \lambda) \to 0$, $x \to a$, for any fixed $s > 0$, if $f(x) \in G^{(1)}(C_1, C_2; \lambda)$.
As in §1.6 we show that $G^{(1)}(C_1, C_2; \lambda_1) \subset G^{(1)}(C_1, C_2; \lambda_2)$ for $\lambda_1 < \lambda_2$. We
find that $G^{(1)}(C_1, C_2; +0) = \lim_{\lambda \to 0} G^{(1)}(C_1, C_2; \lambda)$ is the cross section of all
classes $G^{(1)}(C_1, C_2; \lambda)$ for $\lambda > 0$. In a similar manner we define classes
$G^{(k)}(C_1, C_2; \lambda)$ and $G^{(k)}(C_1, C_2; +0) = \lim_{\lambda \to 0} G^{(k)}(C_1, C_2; \lambda)$. Here $G^{(k)}(C_1, C_2; \lambda)$
is simply that subclass of $C^{(2k)}(a, b]$ consisting of functions $f(x)$ such that


(8) The same argument applies in case either end point should be regular or the condition
$R(x, x_0; \lambda) \to \infty$ should fail to hold, provided $f(x)$ be constrained to vanish at the end point in
question. Various intermediary types of operators are covered by this remark.

(9) A change of variable, replacing $x$ by $-x$, reduces the discussion to the case considered
in Lemma 4.




$f(x)$, $Lf(x)$, $\cdots$, $L^{k-1} f(x)$ all belong to $G^{(1)}(C_1, C_2; \lambda)$. We have obviously
$G^{(k)}(C_1, C_2; 0) \subset G^{(k)}(C_1, C_2; +0)$. We can sum up the result in


THEOREM 3. $B_4^{(k)}\{L; C_1, C_2; (a, b]\} = G^{(k)}(C_1, C_2; +0)$.


The proof of Theorem 1 in the semi-singular case follows the same lines
as in the preceding cases. Suppose that $f(x) \in B_4^{(1)}\{L; C_1, C_2; (a, b]\}$. If $C_2 = 0$,
that is, if $f(b) = 0$, we choose $y(x, \lambda) = y(x; b, \lambda)$ and proceed as in the singular
case. If $C_2 \ne 0$, we set $s = C_1/C_2 = -f'(b)/f(b)$ and take $y(x, \lambda) = y(x; b, -s, \lambda)$.
Putting $g(x) = f(x)/y(x, \lambda)$ we see that $g(x) \to 0$ when $x \to a$ and $g'(b) = 0$ since
numerator and denominator of the fraction have the same logarithmic derivatives
at $x = b$. The proof then proceeds as in the Sturm-Liouville case with
generalized boundary conditions.

1.8. Extensions. The case \=0 figured briefly in §1.6. It is of some inter- 
est to determine function classes for which the operator L itself is oscillation 
preserving. We arrive at the following result for the proof of which the reader 
will find the necessary material in the preceding sections. 


THEOREM 4. If the operator $L$ is of type $T_\nu$, there exists a class $F_\nu$ with respect
to which $L$ is oscillation preserving in $(a, b)$. If $\nu = 1$ or 2 we have
$F_\nu \supset B_\nu^{(1)}\{L; [a, b]\}$, while $F_3 \supset F^{(1)}(0)$ and $F_4 \supset G^{(1)}(C_1, C_2; 0)$.


In the remainder of the paper we shall have to apply a given operator $L$
more than once to the functions under consideration. Here is where the classes
$B_\nu^{(k)}\{L; (a, b)\}$ with $k > 1$ are required. We note that if $f(x) \in B_\nu^{(k)}\{L; (a, b)\}$
and $\lambda > 0$ then $(L - \lambda) f(x) \in B_\nu^{(k-1)}\{L; (a, b)\}$. Repeated application of Theorem 1
leads to the following result.


THEOREM 5. Let $\Pi_k(u)$ be a polynomial in $u$ of degree $k$, having real coefficients
and real positive zeros. If $L$ is of type $T_\nu$, then $\Pi_k(L)$ is an oscillation
preserving transformation in $(a, b)$ with respect to the corresponding class
$B_\nu^{(k)}\{L; (a, b)\}$.


In particular, we can always allow the class $B_\nu^{(\infty)}\{L; (a, b)\}$. It is obvious
that $B_\nu^{(k)} \supset B_\nu^{(k+1)} \supset B_\nu^{(\infty)}$ and it can be shown that $B_\nu^{(\infty)}$ is never vacuous(10).

By virtue of Theorem 4 we can also allow the root $u = 0$ with arbitrary
multiplicity, in cases $T_1$ and $T_2$ without restriction of the class and in cases
$T_3$ and $T_4$ at least for the corresponding classes $F^{(k)}(0)$ and $G^{(k)}(C_1, C_2; 0)$.
We can also extend in a different direction. We can allow operators of the 
form E(L) where E(u) is a suitably restricted entire function, provided we 


(10) For $\nu = 1$ and 2, this follows from Theorems 7, 10, and 11 below. For $\nu = 3$ and 4 the
statement is also obvious whenever the corresponding boundary value problems $P_3$ and $P_4$
of §1.9 have solutions. In more general cases, the following type of argument leads to functions
having the desired properties. Suppose $\nu = 3$, $a$ and $b$ finite and at most poles of the coefficients.
Then we can take any function of the form exp [—A(x—a)*—B(x—b)], $A > 0$, $B > 0$. The
modifications necessary in case $a$ or $b$ or both are infinite are obvious. Heavier singularities can
be handled by stepping up on the exponential scale. The same type of functions will do for $\nu = 4$.




also restrict f(x) to be analytic. The result, being of no importance for the 
following, is stated without proof. 


THEOREM 6. Let $E(u)$ be an entire function of order 1/2 and minimal
type(11), having real coefficients and real positive zeros. Let $L$ be of type $T_\nu$.
Let $A_\nu\{L; (a, b)\}$ be obtained from $B_\nu^{(\infty)}\{L; (a, b)\}$ by replacing the requirement
$f(x) \in C^{(\infty)}(a, b)$ by $f(x) \in A(a, b)$. Then $E(L)$ is an oscillation preserving transformation
in $(a, b)$ with respect to the class $A_\nu\{L; (a, b)\}$.


What was said above regarding the root u=0 applies also, mutatis mu- 
tandis, to the case of an entire function. 

1.9. The associated boundary value problems. With each operator $L$ of
type $T_\nu$ there is an associated boundary value problem. We refer to the question
of determining characteristic functions and characteristic values of the
problem


(1.9.1) $(L + \mu) u = 0, \quad u(x) \in B_\nu\{L; (a, b)\}.$


Thanks to the analyticity assumptions for the coefficients of $L$ any solution
must also have the property $u(x) \in A_\nu\{L; (a, b)\}$. For the sake of clarity, we
write out in full the four problems.

$P_1$. $(L + \mu) u = 0$, $u(a) = 0$, $u(b) = 0$.

$P_2$. $(L + \mu) u = 0$, $u(a) = u(b)$, $u'(a) = u'(b)$.

$P_3$. $(L + \mu) u = 0$, $u(x)/y(x; x_0, \lambda) \to 0$, $x \to a, b$, for every $\lambda > 0$.

$P_4$. $(L + \mu) u = 0$, $u(x)/y(x; b, \lambda) \to 0$, $x \to a$, for every $\lambda > 0$; $C_1 u(b) + C_2 u'(b) = 0$, $C_1 \ge 0$, $C_2 \ge 0$, $C_1 + C_2 > 0$.

The problems $P_1$ and $P_2$ are classical boundary value problems of the
Sturm-Liouville and periodic types, respectively. It is well known that these
problems have solutions and the reader will find a short summary of the available
information concerning the properties of the solutions, to the extent that
is needed for our purposes, in §§2.5 and 2.6 below.

Boundary value problems of types $P_3$ and $P_4$ do not seem to have been discussed
in the literature though a number of the best known special orthogonal
systems used in analysis can be obtained as solutions of such problems. This
is not the right place to develop a general theory of problems $P_3$ and $P_4$. We
restrict ourselves here to pointing out the existence of the problems and will
call attention to the special instances as they are encountered in Chapter II.

In the case of problems $P_1$ and $P_2$ there is in existence a well developed expansion
theory. Thus, for instance, every function $f(x) \in B_1^{(1)}\{L; [a, b]\}$ can
be represented by a uniformly convergent series in terms of characteristic
functions of $P_1$. The same is true in the case of $P_2$. It is natural to expect that


(11) The statement means that $E(u) \exp(-\epsilon |u|^{1/2}) \to 0$ when $|u| \to \infty$ for every $\epsilon > 0$. It
would be more precise to say that the order is at most 1/2 and if it equals 1/2, then the function
is of minimal type.




a similar situation holds under fairly general circumstances also in the case of
$P_3$ and $P_4$. A number of special instances are well known.


CHAPTER II. FINITE CHARACTERISTIC SERIES 


2.1. Admissible systems. In this chapter we shall start the study of the
relationship between the infinitary behavior of the sequence $V[L^k f(x)]$ and
the analytical nature of $f(x)$. This will be carried out under rather severe restrictions
on $L$ and on $f(x)$. In part the restrictions are dictated by the nature
of the problem, but they are also due to our lack of knowledge regarding the
boundary problems $P_3$ and $P_4$ defined in §1.9. This makes it necessary for us
to postulate the existence of a solution of the boundary problems involved
with fairly regular properties of characteristic values and functions.

We consider first a system $S = S\{L, u_n(x), \mu_n; (a, b)\}$ consisting of an
operator $L$, a set of characteristic functions $\{u_n(x)\}$ and corresponding characteristic
values $\{\mu_n\}$, the interval being $(a, b)$. We say that $S$ is admissible
if it satisfies the assumptions $A_1$ to $A_6$ below and $L$ is of one of the types $T_\nu$ defined
in §1.3.

$A_1$. $p_n(x) \in A(a, b)$, $n = 0, 1, 2$.

$A_2$. $p_0(x) \le 0$, $p_2(x) > 0$, for $a < x < b$.

$A_3$. The functions $\{[P(x)]^{1/2} u_n(x)\}$ form a real orthonormal system, complete
in $L_2(a, b)$.

$A_4$. $0 < \mu_n \le \mu_{n+1}$. The series $\sum \mu_n^{-\alpha}$ is convergent for some $\alpha > 0$.

$A_5$. There exist constants $\beta$ and $\gamma$ and a non-negative function $U(x) \in C(a, b)$, such that

$|u_n(x)| \le \mu_n^{\beta} U(x), \quad |u_n'(x)| \le \mu_n^{\gamma} U(x).$

$A_6$. For every fixed interval $(c, d)$, $a \le c < d \le b$, $Z_n(c, d)$, the number of zeros
of $u_n(x)$ in $(c, d)$, tends to infinity with $n$. $Z_n(a, b)$ is finite and a never decreasing
function of $n$(12).

A number of admissible systems occurring in classical analysis will be ex- 
hibited in §§2.5 to 2.11 below. 

We also consider a set $F = F\{L, u_n(x), \mu_n; (a, b)\}$ of characteristic series


(2.1.1) $f(x) \sim \sum_{n=1}^{\infty} f_n u_n(x).$


This set will be called admissible if $S$ is admissible and
$C_1$. $f_n$ is real for all $n$,
$C_2$. $\sum_{n=1}^{\infty} \mu_n^{2k} f_n^2 < \infty$, $k = 0, 1, 2, \cdots$.
This condition is obviously equivalent to the convergence of


(12) We have $Z_n(a, b) = V[u_n(x)]$ except possibly in the periodic case when we may have
$Z_n(a, b) + 1 = V[u_n(x)]$.


$\sum_{n=1}^{\infty} \mu_n^{m} f_n^2$

for every integral value of $m$. In other words, the series

(2.1.2) $\sum_{n=1}^{\infty} (-\mu_n)^m f_n u_n(x)$


represents a function $f_m(x)$ such that

$[P(x)]^{1/2} f_m(x) \in L_2(a, b)$


for $m = 0, 1, 2, \cdots$. It is clear that $f_m(x)$ is obtained from $f(x) = f_0(x)$ by termwise
operation with $L$ in the series (2.1.1). A characteristic series is admissible
if its coefficients satisfy $C_1$ and $C_2$ and $S$ is admissible. Such a series defines an
admissible function.

2.2. Convergence in F. An admissible series converges not merely in 
weighted mean square but also in the local sense. 


LEMMA 5. If $f(x) \in F$, then the series


$\sum_{n=1}^{\infty} f_n u_n(x), \quad \sum_{n=1}^{\infty} f_n u_n'(x)$


converge absolutely and uniformly in every fixed interval $(a_1, b_1)$, $a < a_1 < b_1 < b$,
their sums being $f(x)$ and $f'(x)$, respectively. If the function $U(x)$ of $A_5$ can be
taken equal to a constant, the convergence is uniform in $[a, b]$.


The convergence properties follow from assumptions $A_5$ and $C_2$. The first
series being convergent in $(a_1, b_1)$ both uniformly and in weighted mean
square, we conclude that the uniform limit is equivalent to $f(x)$ and can be
taken as the definition of $f(x)$ for all $x$. The sum of the uniformly convergent
derived series is then obviously $f'(x)$.


LEMMA 6. If $f(x) \in F$, so does $L[f(x)]$ and
(2.2.1) $L[f(x)] = -\sum_{n=1}^{\infty} \mu_n f_n u_n(x).$


For the proof we observe that the second derived series of $f(x)$ also converges
absolutely and uniformly in $(a_1, b_1)$ and hence has the sum $f''(x)$. This
follows from the identity


$p_2(x) \sum_{n=1}^{k} f_n u_n''(x) = -\sum_{n=1}^{k} f_n \{ p_1(x) u_n'(x) + [p_0(x) + \mu_n] u_n(x) \}.$




Hence $L[f(x)]$ exists and


$L[f(x)] = -\sum_{n=1}^{\infty} \mu_n f_n u_n(x) = f_1(x)$


is an element of $F$.

It follows that all functions $L^m[f(x)] = f_m(x)$ exist and belong to $F$ whenever
$f(x)$ does. Thus we can apply the operation $L$ as often as we please termwise
wise to an admissible characteristic series and the result will stay in F. Such 
a series can also be differentiated termwise arbitrarily often, but it is not a 
priori obvious that the result is always in F, though this appears to be true 
in simple cases. 
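Termwise operation with $L$ on a characteristic series can be illustrated in the simplest Sturm-Liouville instance. The sketch below is an illustration only. It assumes $L = D^2$ on $(0, \pi)$ with $u_n(x) = (2/\pi)^{1/2} \sin nx$ and $\mu_n = n^2$, and a rapidly decreasing sample sequence of coefficients $f_n = e^{-n}$; the function names are hypothetical.

```python
import math

def u(n, x):
    """Orthonormal characteristic functions of L = D^2 on (0, pi), vanishing at both ends."""
    return math.sqrt(2 / math.pi) * math.sin(n * x)

def mu(n):
    """Characteristic values: (L + mu_n) u_n = 0 with mu_n = n^2."""
    return float(n * n)

def L_coeffs(coeffs, m=1):
    """Coefficients of L^m f obtained termwise: f_n -> (-mu_n)^m f_n."""
    return [(-mu(n)) ** m * c for n, c in enumerate(coeffs, start=1)]

def series(coeffs, x):
    """Partial sum of the characteristic series at the point x."""
    return sum(c * u(n, x) for n, c in enumerate(coeffs, start=1))

# Sample coefficients (assumed): sum of mu_n^(2k) f_n^2 converges for every k.
coeffs = [math.exp(-n) for n in range(1, 31)]

# Compare the termwise transform with a second difference quotient of the sum.
x, h = 1.0, 1e-3
termwise = series(L_coeffs(coeffs), x)
difference = (series(coeffs, x + h) - 2 * series(coeffs, x) + series(coeffs, x - h)) / h ** 2
```

The value obtained by multiplying each coefficient by $-\mu_n$ agrees with the second derivative of the sum up to the discretization error.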

The class F could evidently be characterized by descriptive properties. 
Its elements are real in $(a, b)$ and $[P(x)]^{1/2} f(x) \in L_2(a, b)$. $F$ is invariant under
the operation L. It is a linear vector space with real multipliers and contains 
the basis {u,(x)}. However, for our purposes it is simpler and more natural 
to start from the characteristic series. 

2.3. Conservative systems. We need a couple of additional assumptions
linking the classes $A_\nu\{L; (a, b)\}$ of Theorem 6 with the systems $S$ and $F$.
They read as follows.

$D_\nu$. If $L$ is of type $T_\nu$, then $u_n(x) \in A_\nu\{L; (a, b)\}$ for all $n$.

$E_3$. If $\nu = 3$ there exists a finite positive $C(x_0, \lambda)$ such that
$U(x) \le C(x_0, \lambda)\, y(x; x_0, \lambda)$ for $a < x < b$, $\lambda > 0$.

$E_4$. If $\nu = 4$ there exists a finite positive $C(\lambda)$ such that $U(x) \le C(\lambda)\, y(x; b, \lambda)$
for $a < x < b$, $\lambda > 0$.

It is worth while stating explicitly what $D_\nu$ amounts to in the various
cases. Since $u_n(x)$ is a characteristic function of $L$, the denumerable infinity
of boundary conditions entering into the definition of $A_\nu\{L; (a, b)\}$ reduces
to a single pair. We get:

$D_1$. $u_n(a) = 0$, $u_n(b) = 0$.

$D_2$. $u_n(a) = u_n(b)$, $u_n'(a) = u_n'(b)$.

$D_3$. $u_n(x)/y(x; x_0, \lambda) \to 0$, $x \to a, b$, for all $\lambda > 0$.

$D_4$. $u_n(x)/y(x; b, \lambda) \to 0$, $x \to a$, for all $\lambda > 0$, and $C_1 u_n(b) + C_2 u_n'(b) = 0$.

In other words, $D_\nu$ asserts the existence of a solution of the corresponding
boundary value problem $P_\nu$ of §1.9 and that this solution is given by
$\{u_n(x), \mu_n\}$. If $\nu = 1$ or 2, the function $U(x)$ of $A_5$ can be taken equal to a
constant. This explains the absence of any conditions $E_1$ and $E_2$. In the two remaining
cases we need an inequality between $U(x)$ and the auxiliary solution
which is supplied by $E_3$ and $E_4$.


DEFINITION. An admissible system $S$ is called conservative if it satisfies the
conditions $D_\nu$ and $E_\nu$ corresponding to its type $T_\nu$.


Thus a conservative system satisfies conditions $A_1$ to $A_6$, one of the conditions
$T_\nu$, $\nu = 1, 2, 3$ or 4, and the corresponding conditions $D_\nu$ and $E_\nu$.




THEOREM 7. If $S = S\{L, u_n(x), \mu_n; (a, b)\}$ of type $T_\nu$ is conservative and
$F = F\{L, u_n(x), \mu_n; (a, b)\}$ is the corresponding admissible set of functions, then
$F \subset B_\nu^{(\infty)}\{L; (a, b)\}$. If $\nu = 1$ or 2, $F = B_\nu^{(\infty)}\{L; [a, b]\}$.


Suppose first that $\nu = 1$ and $f(x) \in F$. We can then find a constant $U_1$ such
that $|u_n(x)| \le U_1$, $a \le x \le b$, for all $n$(13). Formula (1.2.4) with $\lambda = -\mu_n$ shows
that $|u_n'(x)| \le \mu_n U_2$ for a suitably chosen constant $U_2$. Thus we can take
$U(x) = U = \max(U_1, U_2)$, $\beta = 0$, $\gamma = 1$ in $A_5$. By Lemma 5 the characteristic
series of $f(x)$ converges uniformly in $[a, b]$. Since every partial sum of the
series vanishes for $x = a$ and $x = b$ by $D_1$, we have $f(a) = f(b) = 0$. Further the
first derived series converges uniformly in $[a, b]$ so that $f'(x)$ is also continuous
in $[a, b]$. By Lemma 6, $L^k f(x) \in F$ for all $k$. This means that $f(x)$ has derivatives
of all orders continuous in $[a, b]$ and $L^k f(a) = L^k f(b) = 0$ for all $k$.
Hence $f(x) \in B_1^{(\infty)}\{L; [a, b]\}$.

Suppose, conversely, that f(x)EBS{L; [a, b]}. This means that 
f(x) EC™ [a, b] and L*f(a) = L*f(b) =0 for all k. Since L *f(x) [a, b] and 
vanishes at the end points, we have 


n=l 


‘ uniformly convergent in [a, 6]. But here we can use the classical identity of 
Lagrange (for the notation, see formulas (1.1.2) and (1.2.2)): 


gL*[h] — hL*[g] = D{K(gh’ — hg’)}. 
If g and k belong to B{*{ L; [a, b]}, integration from a to b gives 


b 
f =f mor teola 


b b 


and by iteration 


b b 
=f mo lar 


for every integer $k \ge 0$. Putting in particular $g(x) = u_n(x)$, $h(x) = f(x)$ we get
$f_{n,k} = (-\mu_n)^k f_n$. Since $f_n$ is real and $\sum f_{n,k}^2$ converges for every $k$, we see that
the coefficients $f_n$ satisfy conditions $C_1$ and $C_2$. Hence $f(x) \in F$ and the theorem
is proved for $\nu=1$.

The same type of argument applies if $\nu=2$, where of course periodicity
plays the same role as vanishing on the boundary did when $\nu=1$.


(13) This follows from property (3) of §2.5.




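Suppose now that $\nu=3$ and that $f(x) \in F$. In order to prove that
$f(x) \in B^{(3)}\{L; (a, b)\}$ it is enough to show that $f(x)/y(x; x_0, \lambda) \to 0$, $x \to a, b$,
for every $\lambda>0$; the $L$-transforms of $f(x)$ will then automatically satisfy the
same conditions. But using $A_5$ and $E_3$ we have

$$\left|\frac{f(x)}{y(x; x_0, \lambda)}\right| \le \left|\sum_{n=1}^{N} \frac{f_n u_n(x)}{y(x; x_0, \lambda)}\right| + \sum_{n=N+1}^{\infty} \frac{|f_n u_n(x)|}{y(x; x_0, \lambda)} \le \left|\sum_{n=1}^{N} \frac{f_n u_n(x)}{y(x; x_0, \lambda)}\right| + C(x_0, \lambda) \sum_{n=N+1}^{\infty} \mu_n^{\beta} |f_n|.$$

By $C_2$ we can choose $N$ so large that the last member is less than any
preassigned $\epsilon$, and by $D_3$ the finite series in the first member tends to zero when
$x \to a$ or $x \to b$. This completes the proof.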

Suppose finally $\nu=4$ and $f(x) \in F$. Since $y(x; b, \lambda)$ is bounded in $[a+\delta, b]$,
$\delta>0$, for fixed $\lambda$, assumption $E_4$ shows that $U(x)$ is bounded in $[a+\delta, b]$.
Hence by Lemma 5 the series for $f(x)$ and $f'(x)$ are uniformly convergent in
$[a+\delta, b]$. The partial sums satisfy the boundary condition $C_1 S_n(b) + C_2 S_n'(b) = 0$
for all $n$. Hence we have also $C_1 f(b) + C_2 f'(b) = 0$ and the same boundary
condition is satisfied by $L^k f(x)$ for all $k$. The proof that $f(x)/y(x; b, \lambda) \to 0$
when $x \to a$ for every $\lambda>0$ goes through as when $\nu=3$.

We cannot assert that $F=B^{(\nu)}$ when $\nu=3$ or 4. The following example
disproves such a conjecture. We take for $L$ the Hermite-Weber operator
$D^2-x^2$, $a=-\infty$, $b=+\infty$; $\{u_n(x)\}$ is the set obtained by orthogonalizing and
normalizing the Hermite polynomials and $\mu_n = 2n+1$. It is shown in §2.9 that
this system is conservative and of type $T_3$. If $f(x)=1$ then $L^k f(x)$ is an even
polynomial of degree $2k$. Referring to formula (2.9.3), which gives the
asymptotic behavior of $y(x; 0, \lambda)$ for large $x$, we see that $f(x)=1$ belongs to
$B^{(3)}\{L; (-\infty, \infty)\}$, but it does not satisfy the boundary conditions (2.9.7)
for $k=0$; so it cannot belong to $F$.

Similarly $S\{D^2-x^2, u_{2n}(x), 4n+1; (0, \infty)\}$ is a conservative system of
type $T_4$, the regular boundary condition being $u'(0)=0$. Again $f(x)=1$
belongs to $B^{(4)}\{D^2-x^2; 0, 1; [0, \infty)\}$ but not to the corresponding class $F$ for
which the singular boundary condition is still given by (2.9.7).
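
The claim that the iterates of $f(x)=1$ under the Hermite-Weber operator are even polynomials of degree $2k$ can be checked directly on polynomial coefficients. The following is only a minimal sketch, assuming NumPy's `numpy.polynomial.Polynomial` class; the helper names `hermite_weber` and `iterate` are illustrative and do not come from the paper.

```python
import numpy as np
from numpy.polynomial import Polynomial

X2 = Polynomial([0.0, 0.0, 1.0])  # the polynomial x^2


def hermite_weber(p):
    """Apply L = D^2 - x^2 to a polynomial p (illustrative helper)."""
    return p.deriv(2) - X2 * p


def iterate(p, k):
    """Return L^k p by applying hermite_weber k times."""
    for _ in range(k):
        p = hermite_weber(p)
    return p


f = Polynomial([1.0])
for k in range(4):
    q = iterate(f, k)
    # every odd-order coefficient vanishes, so q is an even polynomial
    is_even = np.allclose(q.coef[1::2], 0.0)
    print(k, q.degree(), is_even, q.coef)
```

Since each iterate is a nonzero polynomial, $x^m L^k f(x)$ cannot tend to zero at infinity, which is the failure of (2.9.7) that the text points out.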

Combining Theorems 5 and 7 we get 


THEOREM 8. Let $S=S\{L, u_n(x), \mu_n; (a, b)\}$ be a conservative system and let $F$
be the corresponding set of admissible functions. Let $\Pi(u)$ be a polynomial in $u$
with real coefficients and real positive zeros. Then $\Pi(L)$ is an oscillation
preserving transformation in $(a, b)$ with respect to $F$.


2.4. The main theorem. We shall now prove 


THEOREM 9. Let $S=S\{L, u_n(x), \mu_n; (a, b)\}$ be a conservative system and let
$F=F\{L, u_n(x), \mu_n; (a, b)\}$ be the corresponding set of admissible functions. Let
$f(x) \in F$ and suppose that

$$\liminf_{k \to \infty} V[L^k f(x)] = N < \infty. \tag{2.4.1}$$

Then there exists an integer $M=M(N)$ such that $f_n=0$ for $n>M$, that is

$$f(x) = \sum_{n=1}^{M} f_n u_n(x). \tag{2.4.2}$$

If all characteristic values are simple,

$$V[u_M(x)] \le N, \tag{2.4.3}$$

otherwise it is at most $N+1$. Conversely, if $f(x)$ is given by (2.4.2), then

$$V[L^k f(x)] \ge V[u_M(x)] \tag{2.4.4}$$

for all large $k$.


For the proof we employ the device of Pólya and Wiener [2] in suitable
modification. To the given function $f(x) \in F$ with Fourier coefficients $f_n$ we
form the auxiliary function

$$\Phi(x, k, m; f) = \sum_{n=1}^{\infty} f_n \left[\frac{4\mu_m\mu_n}{(\mu_m + \mu_n)^2}\right]^k u_n(x), \tag{2.4.5}$$

where $k \ge 0$, $m \ge 1$ are arbitrary integers. We have $\Phi(x, 0, m; f) = f(x)$. The
multipliers are positive numbers less than or equal to 1 and equal to 1 only
when $\mu_n = \mu_m$. Since every characteristic value is at most double, this means
for at most two values of $n$. It follows that the coefficients of $\Phi$ also satisfy
conditions $C_1$ and $C_2$ so that $\Phi \in F$. We can consequently apply the operator
$(L-\mu_m)$ termwise as often as we please to the series (2.4.5). We find in
particular that

$$(L - \mu_m)^{2k}\, \Phi(x, k, m; f) = (-4\mu_m)^k L^k f(x). \tag{2.4.6}$$

But $\mu_m > 0$ and by Theorem 8 $(L-\mu_m)^{2k}$ is an oscillation preserving
transformation with respect to the class $F$ in $(a, b)$. Hence

$$N_k = V[L^k f(x)] \ge V[\Phi(x, k, m; f)] \tag{2.4.7}$$

for every $k$ and $m$.
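
The damping effect of the multipliers can be seen numerically. The sketch below is purely illustrative: it assumes multipliers of the form $[4\mu_m\mu_n/(\mu_m+\mu_n)^2]^k$, which is consistent with the properties stated for them (at most 1, and equal to 1 only when $\mu_n=\mu_m$), and it uses the hypothetical spectrum $\mu_n = n^2$, $n = 1, \ldots, 8$, with $m = 3$. The function name `multiplier` is not taken from the paper.

```python
def multiplier(mu_m, mu_n, k):
    """Assumed multiplier [4*mu_m*mu_n / (mu_m + mu_n)^2]^k attached to u_n."""
    return (4.0 * mu_m * mu_n / (mu_m + mu_n) ** 2) ** k


# Hypothetical spectrum mu_n = n^2 (as for L = D^2 with u_n = sin(n x)).
spectrum = {n: float(n * n) for n in range(1, 9)}
m, k = 3, 25
damping = {n: multiplier(spectrum[m], mu_n, k) for n, mu_n in spectrum.items()}

for n, value in damping.items():
    marker = "  <- selected term" if n == m else ""
    print(f"n={n}: {value:.3e}{marker}")
```

Only the term with $n=m$ keeps its full weight; every other term is multiplied by a factor that tends to zero geometrically in $k$, which is what lets the selected term (or pair of terms) dominate in the estimates that follow.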

So far $m$ was arbitrary. We suppose now that $f_m \ne 0$. In order to take
care of the slightly more complicated case in which there are double
characteristic values, let us suppose $\mu_{m-1} = \mu_m$ and that also $f_{m-1} \ne 0$. We then
write $\Phi = S_1 + S_2 + S_3$. Here $S_1$ is the finite sum from $n=1$ to $n=m-2$,
$S_2 = f_{m-1}u_{m-1}(x) + f_m u_m(x)$, while $S_3$ is the remainder. The trivial
modification necessary if $\mu_m$ is simple is obvious. We shall estimate $S_1$ and $S_3$. The
idea of the proof is to show that for sufficiently large values of $k$, $|S_1+S_3|$
is dominated by $|S_2|$ at the maxima of the latter, provided that we restrict
ourselves to a fixed interior interval $(a_1, b_1)$, and that consequently the
oscillatory properties of $\Phi$ in this interval are essentially the same as those of $S_2$.
The latter, however, are regulated by assumption $A_6$ which ensures that the
number of sign changes of $S_2$ in $(a_1, b_1)$ tends to infinity with $m$. This will
lead to a contradiction for suitable values of $m$.

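We consider now an arbitrary interior interval $(a_1, b_1)$, $a<a_1<b_1<b$. Let
$B = \max U(x)$ for $a_1 \le x \le b_1$. Let

$$\delta = \max_n \frac{4\mu_m\mu_n}{(\mu_m + \mu_n)^2}$$

for $n \ne m-1$ and $m$ ($n \ne m$, if $\mu_m$ is simple). We have $\delta<1$. By assumption $A_5$

$$|S_1 + S_3| \le \delta^k {\sum}' |f_n u_n(x)| \le \delta^k {\sum}' \mu_n^{\beta} |f_n|\, U(x) \le \delta^k B {\sum}' \mu_n^{\beta} |f_n|$$

for $a_1 \le x \le b_1$. Here the prime after the summation sign indicates that
$n \ne m-1$ and $m$.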

Let us write $Z_m(a_1, b_1) = j_m$. Assumption $A_6$ asserts that $j_m \to \infty$ with $m$.
Since $\mu_m$ is a double characteristic value, $S_2(x) = f_{m-1}u_{m-1}(x) + f_m u_m(x)$ is a real
solution of the differential equation $(L+\mu_m)y=0$. Consequently it has at
least $j_m-1$ and at most $j_m+1$ zeros in $(a_1, b_1)$ by the classical oscillation
theorems. Let the actual number be $i_m$. We can suppose without essential
restriction of the generality that the zeros of $S_2(x)$ are interior to the interval $(a_1, b_1)$
and that $S_2(a_1)>0$. Then $\operatorname{sgn} S_2(b_1) = (-1)^{i_m}$. Let the zeros occur at the
points $x_\alpha$, $a_1<x_1<x_2<\cdots<x_{i_m}<b_1$. Let $\xi_\alpha$ be the uniquely determined
point between $x_\alpha$ and $x_{\alpha+1}$ where $S_2'(x)=0$. If $S_2'(x)=0$ at a point in $(a_1, x_1)$,
we denote this point by $\xi_0$; otherwise we set $\xi_0=a_1$. Similarly $\xi_{i_m}$ is either the
point where $S_2'(x)=0$ in $(x_{i_m}, b_1)$ or $b_1$ itself. We note that the points $\xi_0$ and
$\xi_{i_m}$ are uniquely determined. Now let

$$\sigma = \min_\alpha |S_2(\xi_\alpha)|, \qquad \alpha = 0, 1, \cdots, i_m.$$

We then choose $k$ so large that

$$\delta^k B {\sum}' \mu_n^{\beta} |f_n| < \sigma,$$

which is obviously possible. But this means that for such values of $k$ and $m$

$$\operatorname{sgn} \Phi(\xi_\alpha, k, m; f) = (-1)^{\alpha}, \qquad \alpha = 0, 1, \cdots, i_m.$$

Hence $\Phi(x, k, m; f)$ has at least $i_m$ sign changes in $(a_1, b_1)$ and a fortiori in
$(a, b)$. Since $(L-\mu_m)^{2k}$ is an oscillation preserving transformation, formulas
(2.4.6) and (2.4.7) show that $L^k f(x)$ also has at least $i_m$ sign changes in $(a, b)$
for all sufficiently large $k$. Hence $i_m \le N_k$ for all large $k$. But (2.4.1) asserts
that $N_k = N$ for infinitely many values of $k$. This implies $i_m \le N$.

This is a contradiction, however, for large $m$ since $i_m$ tends to infinity
with $m$. Since $i_m \ge j_m-1$, this gives a contradiction for $j_m > N+1$. We are thus
led to the conclusion that the characteristic series of $f(x)$ cannot contain any
term $u_m(x)$ having more than $N+1$ zeros in $(a_1, b_1)$. But here $(a_1, b_1)$ is a
perfectly arbitrary interior interval. It follows that the series of $f(x)$ contains no
term $u_m(x)$ with $Z_m(a, b)>N+1$. If there are no multiple characteristic
values, we can replace $N+1$ by $N$ since then $j_m = i_m$. In the case of double
characteristic values, however, it would seem possible for the finite sum to end
with two terms corresponding to the same characteristic value, either term
having $N+1$ zeros while their sum has only $N$ zeros. Whether or not this
exceptional case can ever arise must be left an open question.

This argument proves formula (2.4.3) except in the periodic case. Here
$N=2K$ is an even integer. Further $Z_m(a, b) \le V[u_m(x)] \le Z_m(a, b)+1$ and
$V[u_m(x)]$ is even. If there are no double characteristic values then the
inequality $Z_m(a, b) \le N = 2K$ implies $V[u_m(x)] \le 2K$ and formula (2.4.3) is
proved. If, however, $\mu_m=\mu_{m-1}$, then the previous proof shows that $S_2(x)$
cannot have more than $2K$ sign changes in $(a, b)$ and hence $V[S_2(x)] \le 2K$. Now
$u_{m-1}(x)$, $u_m(x)$, and $S_2(x)$ are solutions of the same differential equation
$(L+\mu_m)u=0$. Hence the three quantities $V[u_{m-1}(x)]$,
$V[u_m(x)]$ and $V[S_2(x)]$ can differ by at most one unit and, being even
integers, they must consequently be equal. This shows that $V[u_m(x)] \le 2K$ and
formula (2.4.3) is proved.

Suppose, conversely, that $f(x)$ is a finite sum of characteristic functions
given by (2.4.2). We choose $m=M$ and form $\Phi(x, k, M; f)$. For $(a_1, b_1)$ we
take an interval containing all zeros of $u_M(x)$ or $S_2(x)$ as the case may be.
Proceeding as above, we see that $\Phi(x, k, M; f)$ has at least as many sign
changes in $(a_1, b_1)$ as the last term or group of terms has for large values of $k$.
Combining with (2.4.7) we see that (2.4.4) holds. The argument is evidently
also valid in the periodic case. It is often possible to exclude the sign of
inequality both in (2.4.3) and in (2.4.4). This completes the proof of Theorem 9.

Pólya and Wiener proved that if $f(x)$ is periodic and $V[D^{2k}f(x)]$ is bounded
with respect to $k$, then the Fourier series of $f(x)$ cannot contain any high
frequency terms. Theorem 9 shows that this result has analogues for general
orthogonal series defined by boundary value problems relating to linear
second order differential equations, the operator $D^2$ being replaced by $L$.

In §§2.5 to 2.11 below we shall give special instances of the theory. 

2.5. Sturm-Liouville operators. We have the following simple results(14):


THEOREM 10. Let $p_m(x) \in A[a, b]$, $m=0, 1, 2$; $p_0(x) \le 0$, $p_2(x)>0$, $a \le x \le b$. Let $\{u_n(x)\}$
and $\{\mu_n\}$, $n=0, 1, 2, \cdots$, be the sets of characteristic functions and
corresponding characteristic values of the boundary value problem

$$(L + \mu)u = 0, \qquad u(a) = 0, \quad u(b) = 0.$$

Then $S\{L, u_n(x), \mu_n; (a, b)\}$ is a conservative system. If $f(x) \in B^{(1)}\{L; [a, b]\}$,
that is, if $f(x) \in C^{(\infty)}[a, b]$ and $L^n f(a)=0$, $L^n f(b)=0$ for $n=0, 1, 2, \cdots$, and

$$\liminf_{k \to \infty} V[L^k f(x)] = N,$$

then

$$f(x) = \sum_{n=0}^{N} f_n u_n(x).$$


(14) We state the assumptions in Theorems 10 and 11 explicitly since they are so simple.
Formulations in terms of the previous postulates are given below.


We are assuming the validity of $A_1$, $A_2$, $T_1$, and $D_1$ and have first to show
that they imply $A_3$ to $A_6$. Now this is the classical Sturm-Liouville system
except for the restrictive assumption of analytic coefficients, which is
unnecessary in the boundary value problem but desirable for our special needs.
Referring to the literature for proofs (see for instance E. L. Ince [1, §§10.61,
10.7, and 11.4]), we list the following properties of the solutions. We put

$$z = \frac{1}{\omega}\int_a^x [p_2(s)]^{-1/2}\,ds, \qquad \omega = \frac{1}{\pi}\int_a^b [p_2(s)]^{-1/2}\,ds, \qquad \rho_n = \omega\mu_n^{1/2}.$$

Then:

(1) The characteristic values are real, positive, and simple.

(2) $\rho_n = n+1+O(1/n)$.

(3) $u_n(x) = A_n[P^2(x)\,p_2(x)]^{-1/4}\{\sin(\rho_n z) + O(1/n)\}$, where $A_n$ is a
normalizing factor, independent of $x$ and bounded with respect to $n$(15).

(4) $V[u_n(x)] = n$.

(5) $\{[P(x)]^{1/2}u_n(x)\}$ is complete in $L_2(a, b)$.

These properties show that conditions $A_3$ to $A_6$ are amply satisfied. Thus
$S$ is a conservative system and Theorem 9 holds for the corresponding
set $F$ of admissible series. It was shown in Theorem 7, however, that
$F=B^{(1)}\{L; [a, b]\}$. Finally, $M(N)=N$ by virtue of property (4). This
completes the proof of Theorem 10.
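
The simplest instance of Theorem 10 is $L = D^2$ on $[0, \pi]$, where $u_n(x)$ is proportional to $\sin((n+1)x)$ and $\mu_n = (n+1)^2$. The following numerical sketch is an illustration only, not part of the paper: it takes $f(x) = \sin x + 0.1 \sin 3x$, applies $L^k$ termwise, and counts sign changes on a grid in the open interval. The names `apply_power`, `sign_changes` and `V` are hypothetical, and grid sampling only approximates the true count.

```python
import numpy as np


def apply_power(coeffs, k):
    """coeffs maps j -> coefficient of sin(j x); return the coefficients of D^(2k) f."""
    return {j: c * float((-(j * j)) ** k) for j, c in coeffs.items()}


def sign_changes(values):
    """Count sign changes in a sampled sequence, ignoring exact zeros."""
    s = np.sign(values)
    s = s[s != 0]
    return int(np.count_nonzero(s[1:] != s[:-1]))


def V(coeffs, k, num=4001):
    """Sampled number of sign changes of L^k f in the open interval (0, pi)."""
    x = np.linspace(0.0, np.pi, num)[1:-1]
    transformed = apply_power(coeffs, k)
    values = sum(c * np.sin(j * x) for j, c in transformed.items())
    return sign_changes(values)


f = {1: 1.0, 3: 0.1}  # f(x) = sin x + 0.1 sin 3x = f_0 u_0 + f_2 u_2 up to normalization
for k in range(5):
    print(k, V(f, k))
```

Here $f$ itself is positive in $(0, \pi)$, but from $k=2$ onward the term in $\sin 3x$ dominates and the count settles at 2, the number of sign changes of the highest characteristic function present, in agreement with $M(N)=N$.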

The same result is valid for more general boundary conditions, for
example, $u(a)=0$, $C_1 u(b) + C_2 u'(b) = 0$, $C_1 \ge 0$, $C_2>0$, and for $u'(a)=0$, $u'(b)=0$.

2.6. Periodic operators. Here we also have simple results. 


THEOREM 11. Let $p_m(x) \in A[a, b]$, $m=0, 1, 2$; $p_2(b) \ne 0$,
$K(a)=K(b)$. Let $\{u_n(x)\}$ and $\{\mu_n\}$ be the sets of characteristic functions and
corresponding characteristic values of the boundary value problem

$$(L + \mu)u = 0, \qquad u(a) = u(b), \quad u'(a) = u'(b).$$

Then $S\{L, u_n(x), \mu_n; (a, b)\}$ is a conservative system. If $f(x) \in B^{(2)}\{L; [a, b]\}$,
that is, if $f(x) \in C^{(\infty)}[a, b]$ and $L^n f(a) = L^n f(b)$, $DL^n f(a) = DL^n f(b)$, $n=0, 1,
2, \cdots$, and

$$\liminf_{k \to \infty} V[L^k f(x)] = 2K,$$

then

$$f(x) = \sum_{n=0}^{2K} f_n u_n(x).$$


(15) $A_n \to \{2/(\pi\omega)\}^{1/2}$ when $n \to \infty$.


Here we assume $A_1$, $A_2$, $T_2$, and $D_2$, and want to conclude that $A_3$ to $A_6$
hold. In the present section $V[g]$ is to be determined according to the
definition for periodic functions. The available information concerning the solutions
of the boundary value problem is quite precise in the case of equations with
periodic coefficients and only slightly less so in the general case. We refer to
E. L. Ince [1, §§10.8, 10.81, and 11.4], where the reader will find further
references to the literature. In the notation of the preceding section we obtain:

(1) The characteristic values are real, non-negative but need not be simple.

(2) $\rho_n = [(n+1)/2] + O(1)$.

(3) $|u_n(x)| \le U$, $a \le x \le b$, $n=0, 1, 2, \cdots$.

(4) $V[u_0(x)] = 0$, $V[u_{2m-1}(x)] = V[u_{2m}(x)] = 2m$.

(5) $\{[P(x)]^{1/2}u_n(x)\}$ is complete in $L_2(a, b)$.

If $K(x)$ and $G(x, \lambda)$ are even periodic functions of period $(b-a)$, the
remainder term in (2) can be replaced by $O(1/n)$ and (3) can be replaced
by formulas of type (3) in §2.5 with sine replaced by cosine when $n$ is
even(16). The properties as listed are, however, more than sufficient to
prove that $A_3$ to $A_6$ are satisfied so that $S$ is a conservative system. Since
$F=B^{(2)}\{L; [a, b]\}$ by Theorem 7, and $M(N)=2K$ by (4), Theorem 11 is
proved.

The simplest of all operators satisfying the conditions of Theorem 11 is
$L=D^2$. In this case the theorem reduces to Theorem I of Pólya and Wiener.
A less trivial instance is given by the operator of Mathieu

$$L = D^2 - (A + B \cos 2x)$$

to which correspond expansions in terms of the functions of the elliptic
cylinder.

The remaining sections of the chapter will be devoted to special instances
of singular and semi-singular operators.

2.7. The Legendre operator. We consider the case

$$L[y] = (1 - x^2)y'' - 2xy', \qquad a=-1, \quad b=+1. \tag{2.7.1}$$

The end points of the interval are singular and we find that $R(x, 0; \lambda)
= -(\lambda/2) \log (1-x^2) \to \infty$ when $x \to \pm 1$. It follows that the problem is of type
$T_3$. The corresponding singular boundary value problem

$$D[(1 - x^2)Du] + \mu u = 0, \qquad u(x)/\log (1 - x^2) \to 0, \quad x \to \pm 1, \tag{2.7.2}$$

has as its characteristic functions the Legendre polynomials $P_n(x)$ with
corresponding characteristic values $n(n+1)$, $n=0, 1, 2, \cdots$(17). We take
$u_n(x) = (n+(1/2))^{1/2}P_n(x)$. We shall prove that the system $S\{L, u_n(x), \mu_n;
(-1, 1)\}$ is conservative. We know to start with that $A_1$, $A_2$, $T_3$, and $D_3$ are
satisfied. It is well known that $A_3$ holds and so does obviously $A_4$, except for
the fact that the least $\mu_n$ is zero. This is immaterial, however(18). We have
$|P_n(x)| \le 1$, $|P_n'(x)| \le n(n+1)/2$, so that $A_5$ and $E_3$ are satisfied. Further the
$n$ zeros of $P_n(x)$ are all located in $(-1, 1)$ and the maximal distance between
consecutive zeros is $O(1/n)$ so that $A_6$ is valid. Thus the system is actually
conservative and Theorem 9 holds for the corresponding class $F$.

(16) It is not difficult to show that similar formulas hold also in the general case considered
in §2.6. We have merely to replace $\rho_n z$ by $\rho_n z + \varphi_n$ where $\varphi_n$ is a suitable phase angle,
determinable with an error which is $O(1/n)$. Property (3) is an immediate consequence of such formulas.

We have now to determine what functions are represented in $[-1, 1]$ by
expansions of the form

$$\sum_{n=0}^{\infty} a_n P_n(x), \qquad \sum_{n=1}^{\infty} n^m |a_n| < \infty$$

for all $m$, $a_n$ being real. It is obvious that the series as well as all derived
series converge uniformly in $[-1, 1]$ so that $f(x) \in C^{(\infty)}[-1, 1]$. Conversely,
if $f(x) \in C^{(\infty)}[-1, 1]$ so do all its $L$-transforms. From this we conclude readily
that $f(x) \in F$. Hence we have shown that

$$F\{D(1-x^2)D,\ (n+(1/2))^{1/2}P_n(x),\ n(n+1);\ (-1, 1)\} = C^{(\infty)}[-1, 1]. \tag{2.7.3}$$


This fact gives the following formulation of Theorem 9 for the Legendre 
operator: 


THEOREM 12. If $L=(1-x^2)D^2-2xD$, $f(x) \in C^{(\infty)}[-1, 1]$, and $\liminf_{k \to \infty}
V[L^k f(x)] = N$, then $f(x)$ is a polynomial of degree $N$. Conversely, every real
polynomial of exact degree $N$ has the property $V[L^k f(x)] = N$ for all large $k$.


In order to prove the converse, we merely express the given polynomial
in terms of Legendre polynomials. The expression will involve the term $P_N(x)$
with a coefficient different from zero. By (2.4.4) $V[L^k f(x)] \ge N$ for all large $k$.
Since $L^k f(x)$ is also a polynomial of degree $N$, we must have $V[L^k f(x)] = N$
for all large $k$.
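
The converse part of Theorem 12 is easy to observe numerically. The sketch below is an illustration under stated assumptions, not the paper's method: it uses NumPy's `Polynomial` class, applies $L=(1-x^2)D^2-2xD$ directly to the coefficients of the sample polynomial $f(x) = 1 + x^2 + x^3$, and counts sign changes on a grid in $(-1, 1)$. The helper names are hypothetical and the grid count is only a sampled approximation of $V$.

```python
import numpy as np
from numpy.polynomial import Polynomial

ONE_MINUS_X2 = Polynomial([1.0, 0.0, -1.0])  # 1 - x^2
TWO_X = Polynomial([0.0, 2.0])               # 2x


def legendre_op(p):
    """Apply L = (1 - x^2) D^2 - 2x D to a polynomial p."""
    return ONE_MINUS_X2 * p.deriv(2) - TWO_X * p.deriv(1)


def sign_changes_on(p, a=-1.0, b=1.0, num=4001):
    """Sampled number of sign changes of p in the open interval (a, b)."""
    x = np.linspace(a, b, num)[1:-1]
    s = np.sign(p(x))
    s = s[s != 0]
    return int(np.count_nonzero(s[1:] != s[:-1]))


def V_of_iterate(p, k):
    """Sampled V[L^k p] on (-1, 1)."""
    for _ in range(k):
        p = legendre_op(p)
    return sign_changes_on(p)


f = Polynomial([1.0, 0.0, 1.0, 1.0])  # 1 + x^2 + x^3, exact degree N = 3
for k in (0, 1, 2, 5, 10):
    print(k, V_of_iterate(f, k))
```

The polynomial itself has no sign change in $(-1, 1)$, yet after enough applications of $L$ the component along $P_3(x)$, which is multiplied by $(-12)^k$, dominates the others and the count becomes 3, the exact degree.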

Theorem 12 has also been proved by Szegő [4, Theorem C] by a different
method.


(17) The proof of this statement goes as follows. The only solution $u(x)$ satisfying the
boundary condition at $x=1$ is a multiple of $F(\alpha+1, -\alpha, 1, (1-x)/2)$ where $\alpha(\alpha+1)=\mu$. This solution
becomes logarithmically infinite at $x=-1$ unless $\alpha$ is an integer, when it reduces to $P_\alpha(x)$.

(18) The fact that $\mu_0=0$ means merely that the case $N=0$ is not covered by Theorem 9.
But if $N=0<1$ we can still conclude that $f(x)=a+bx$ and since $L^k(a+bx) = (-2)^k bx$, we must
have $b=0$. Hence Theorem 12 is also valid for $N=0$. A similar argument takes care of the other
cases encountered below in which the least characteristic value is zero.


2.8. Jacobi operators. Analogous results can be proved for the Jacobi
operator

$$L = (1 - x^2)D^2 + [\beta - \alpha - (\alpha + \beta + 2)x]D \tag{2.8.1}$$

for the interval $(-1, 1)$. Here the end points are again singular. A simple
calculation shows that $R(x, 0; \lambda) \to \infty$ when $x \to 1$ if and only if $\alpha \ge 0$, and
when $x \to -1$ if and only if $\beta \ge 0$. Thus the problem is of type $T_3$ if and only
if $\alpha \ge 0$, $\beta \ge 0$. We shall suppose $\alpha>0$, $\beta>0$, since the limiting case $\alpha=0$, $\beta=0$,
reduces to Legendre's operator(19). The corresponding singular boundary
value problem can then be formulated as follows:

$$(L + \mu)u = 0, \qquad u(x)(1 - x)^{\alpha}(1 + x)^{\beta} \to 0, \quad x \to \pm 1, \tag{2.8.2}$$

since the calculation shows that $(1-x)^{\alpha}(1+x)^{\beta}\,y(x; 0, \lambda)$ is bounded away
from zero and infinity in $(-1, 1)$. The solutions are given by the Jacobi
polynomials $u_n(x) = A_n P_n^{(\alpha, \beta)}(x)$, $\mu_n = n(n+\alpha+\beta+1)$, where $A_n$ is a
normalizing factor. The reader will find in the treatise by Szegő [5, §§3.1, 7.32, and
8.9], the necessary information regarding Jacobi polynomials required to
show that $S\{L, u_n(x), \mu_n; (-1, 1)\}$ is a conservative system. We show as in
§2.7 that the corresponding set $F$ of admissible functions is identical with
$C^{(\infty)}[-1, 1]$.

It follows that Theorem 12 remains valid if we replace the Legendre
operator by the general Jacobi operator (2.8.1) provided $\alpha \ge 0$, $\beta \ge 0$. Professor
Szegő has kindly informed me that the theorem actually remains true for
$\alpha>-1$, $\beta>-1$ and that this can be proved both by his method used in [4]
and by a suitable modification of mine. We note, in particular, the case
$\alpha=\beta=-1/2$ which leads to the polynomials of Tchebycheff. By his method
Szegő is also able to prove that if $\alpha$ and $\beta$ are arbitrary real numbers, then the
assumptions $f(x) \in C^{(\infty)}[-1, 1]$ and $\liminf_{k \to \infty} V[L^k f(x)] = N$ imply that $f(x)$
is a polynomial of degree at most $N+M(\alpha, \beta)$ where $M(\alpha, \beta)$ is an integer
depending only upon $\alpha$ and $\beta$. Detailed proofs will be presented in a later note
in this series.

2.9. The Hermite and Hermite-Weber operators. We consider next the
two operators

$$L_1 = D^2 - 2xD, \qquad L_2 = D^2 - x^2, \tag{2.9.1}$$

which we refer to as the Hermite and Hermite-Weber operators respectively.
The interval $(a, b)$ will be $(-\infty, \infty)$. Since

$$\exp[x^2/2]\, L_2\{\exp[-x^2/2]\, y\} = (L_1 - 1)y, \tag{2.9.2}$$

the two operators can be treated simultaneously. The Hermite-Weber case
is easier to handle directly, but the final result is more striking if expressed
in terms of the Hermite operator.

(19) The cases $\alpha=0$, $\beta>0$ and $\alpha>0$, $\beta=0$ can also be handled by the same method. One of
the powers occurring in (2.8.2) should then be replaced by a logarithm.

We concentrate the attention on $L_2$. The point at infinity is singular and
$R(x, 0; \lambda) \to \infty$ when $|x| \to \infty$ for $\lambda>0$. The problem, therefore, is of type $T_3$.
The function $y(x, 0; \lambda)$ is a constant multiple of $D_\nu(2^{1/2}x)+D_\nu(-2^{1/2}x)$ where
$D_\nu$ is the parabolic cylinder function of Whittaker and $\nu = -(1+\lambda)/2$. For
large values of $|x|$ we have consequently (cf. E. T. Whittaker and G. N.
Watson [6, §16.52])


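$$y(x, 0; \lambda) = B(\lambda)\,|x|^{(\lambda-1)/2} \exp[x^2/2]\,\{1 + o(1)\}. \tag{2.9.3}$$

The singular boundary value problem

$$(D^2 + \mu - x^2)u = 0, \qquad u(x)\,|x|^{(1-\lambda)/2} \exp[-x^2/2] \to 0, \quad |x| \to \infty, \tag{2.9.4}$$

has for solutions the Weber-Hermite functions(20)

$$u_n(x) = (\pi^{1/2}\, 2^n\, n!)^{-1/2} \exp[-x^2/2]\, H_n(x), \qquad \mu_n = 2n + 1.$$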


It is well known that this system is complete in $L_2(-\infty, \infty)$. Condition $A_5$ is
fulfilled since(21)


(2.9.5) | | S Bi, | (x) | 


All zeros of $u_n(x)$, zeros of the $n$th Hermite polynomial, are real and located
in the interval $(-\mu_n^{1/2}, \mu_n^{1/2})$. They are densest towards the center of the interval,
but the minimum distance between consecutive zeros is of the same order as
the average distance. It follows that the conditions $A_3$ to $A_6$ and $E_3$ are
satisfied. Hence $S$ is a conservative system and Theorem 9 applies to the
corresponding set $F$.

The determination of the class $F$ is much more laborious than in the
Legendre case. We know that

$$f(x) = \sum_{n=0}^{\infty} f_n u_n(x), \qquad \sum_{n=1}^{\infty} n^m |f_n| < \infty$$

for all $m$. The series is uniformly convergent in the infinite interval by
virtue of (2.9.5) and the terms tend to zero as $|x| \to \infty$. It follows that
$f(x) \in C_0[-\infty, \infty]$ where the subscript 0 indicates that $f(x) \to 0$ when
$|x| \to \infty$. It is obvious that all $L$-transforms also belong to $C_0[-\infty, \infty]$.
But much more can be asserted.

(20) For the proof it is enough to observe that the only solution which satisfies the boundary
condition when $x \to \infty$ is a multiple of $D_\nu(2^{1/2}x)$, $\nu=(\mu-1)/2$, and that this solution, as is seen
from its asymptotic representation, does not satisfy the boundary condition for $x \to -\infty$ unless
$\mu$ is a positive odd integer.

(21) Better estimates are available. For the first inequality see Szegő [5, Theorem 8.91.3].
The second inequality follows from the first combined with formula (2.9.6) below. For the
properties of Hermite polynomials used in this discussion see also §§5.5, 5.7, 6.31, and 6.32 of
Szegő's treatise.

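To this end we note that $f(x) \in F$ implies $xf(x)$ and $f'(x) \in F$. This is a
consequence of the relations

$$x u_n(x) = (n/2)^{1/2} u_{n-1}(x) + ((n+1)/2)^{1/2} u_{n+1}(x), \qquad u_n'(x) = (n/2)^{1/2} u_{n-1}(x) - ((n+1)/2)^{1/2} u_{n+1}(x), \tag{2.9.6}$$

which in their turn follow from the recurrence formulas for Hermite
polynomials plus the relation $H_n'(x) = 2nH_{n-1}(x)$. Hence

$$xf(x) = \sum_{n=0}^{\infty} \{(n/2)^{1/2} f_{n-1} + ((n+1)/2)^{1/2} f_{n+1}\}\, u_n(x),$$

$$f'(x) = \sum_{n=0}^{\infty} \{((n+1)/2)^{1/2} f_{n+1} - (n/2)^{1/2} f_{n-1}\}\, u_n(x),$$

with $f_{-1}=0$, and these series are clearly elements of $F$. By induction we show that
$x^m f^{(k)}(x) \in F$ for every $k$ and $m$. But this implies $x^m f^{(k)}(x) \in C_0[-\infty, \infty]$ for
all $k$ and $m$.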

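Conversely, suppose that $x^m f^{(k)}(x) \in C_0[-\infty, \infty]$ for all $k$ and $m$. This
implies that $x^m L^k f(x)$ has the same property and consequently $L^k f(x)
\in L_2(-\infty, \infty)$ for every $k$. But if $g(x)$ is a function satisfying the same
conditions as $f(x)$, an application of Lagrange's identity combined with the
boundary conditions gives

$$\int_{-\infty}^{\infty} g(x) L^k f(x)\,dx = \int_{-\infty}^{\infty} f(x) L^k g(x)\,dx$$

and in particular

$$\int_{-\infty}^{\infty} u_n(x) L^k f(x)\,dx = \int_{-\infty}^{\infty} f(x) L^k u_n(x)\,dx = (-1)^k (2n + 1)^k f_n.$$

We conclude that the coefficients $f_n$ must satisfy conditions $C_1$ and $C_2$ and
that $f(x) \in F$. Consequently we have proved:

The class $F\{D^2-x^2, u_n(x), 2n+1; (-\infty, \infty)\}$ is that subset of $C^{(\infty)}(-\infty, \infty)$
the elements of which satisfy the boundary conditions

$$\lim_{|x| \to \infty} x^m f^{(k)}(x) = 0, \qquad k, m = 0, 1, 2, \cdots. \tag{2.9.7}$$

From this we get without difficulty(22):

The class $F\{D^2-2xD, A_nH_n(x), 2n; (-\infty, \infty)\}$ is that subset of
$C^{(\infty)}(-\infty, \infty)$, the elements of which satisfy the boundary conditions

$$\lim_{|x| \to \infty} x^m \exp[-x^2/2]\, f^{(k)}(x) = 0, \qquad k, m = 0, 1, 2, \cdots. \tag{2.9.8}$$

We can consequently formulate Theorem 9 as follows for the case of the
Hermite operator.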


(22) $A_n$ is a normalization factor.


THEOREM 13. Let $L=D^2-2xD$. Let $f(x) \in C^{(\infty)}(-\infty, \infty)$ and satisfy the
boundary conditions (2.9.8). If $\liminf_{k \to \infty} V[L^k f(x)] = N$, then $f(x)$ is a
polynomial of degree $N$. Conversely, every real polynomial of exact degree $N$ has the
property $V[L^k f(x)] = N$ for all large values of $k$.


2.10. The Laguerre operator. Our last example of a singular operator is
that of Laguerre

$$L = xD^2 + (1 - x)D, \tag{2.10.1}$$

the interval being $(0, \infty)$. The equation

$$(L - \lambda)y = xy'' + (1 - x)y' - \lambda y = 0$$

has singular points at 0 and $\infty$. A simple computation shows that $R(x, 1; \lambda)
\to \infty$ at both points, so the problem is of type $T_3$. The origin is a regular
singular point with indicial equation $\rho^2=0$. Hence there is a solution which becomes
infinite as $\log (1/x)$ while the other solution is regular at $x=0$. The point at
infinity is irregular-singular. Assuming $x$ and $\lambda$ positive, we have one solution
tending to zero as $x^{-\lambda}$ and another tending to infinity as $x^{\lambda-1}e^{x}$ when $x \to \infty$.
It follows that

$$y(x, 1; \lambda) = \begin{cases} A(\lambda) \log (1/x)\,\{1 + o(1)\}, & x \to 0, \\ B(\lambda)\, x^{\lambda-1} e^{x}\,\{1 + o(1)\}, & x \to \infty. \end{cases}$$

The singular boundary value problem

$$(L + \mu)u = 0, \qquad \lim_{x \to 0} \frac{u(x)}{\log (1/x)} = 0, \qquad \lim_{x \to \infty} u(x)\, x^{1-\lambda} e^{-x} = 0 \tag{2.10.2}$$

(for every $\lambda>0$) determines the Laguerre polynomials $L_n(x)$ corresponding
to the characteristic values $n=0, 1, 2, \cdots$(23).
The system $\{e^{-x/2}L_n(x)\}$ is complete in $L_2(0, \infty)$. It was proved by Szegő
that

$$|L_n(x)| \le e^{x/2}, \qquad x \ge 0.$$

Since

$$L_n'(x) = -\sum_{\nu=0}^{n-1} L_\nu(x), \tag{2.10.3}$$

we get

$$|L_n'(x)| \le n\, e^{x/2}.$$

These inequalities show that $A_5$ and $E_3$ are satisfied if we take $U(x) = e^{x/2}$.
For later use we note the recurrence formula

$$(n + 1)L_{n+1}(x) + (x - 2n - 1)L_n(x) + nL_{n-1}(x) = 0. \tag{2.10.4}$$

The zeros of $L_n(x)$ are all real positive and the $m$th zero equals
$C_{m,n}(m+1)^2(n+1)^{-1}$ where $1/4 \le C_{m,n} \le 4$. It follows that $A_6$ also holds and $S$
is consequently a conservative system.

(23) To prove that no other solutions exist is fairly complicated. We shall merely outline an
argument which the interested reader will be able to complete. There exist two formal but
asymptotic solutions of the form $x^{\mu}\mathfrak{P}_1(1/x)$ and $e^{x}x^{-1-\mu}\mathfrak{P}_2(1/x)$, of which only the first one
satisfies the boundary condition at infinity. The series are easily computed. If $\mu=n$ the first series
terminates and reduces to a multiple of $L_n(x)$. For other values of $\mu$ it may be summed by Borel's
method which leads to the result $u(x) = x^{\mu+1}\int_0^{\infty} F(-\mu, -\mu, 1, -t)e^{-xt}\,dt$. The behavior of the
integral for small positive $x$ is determined by that of the hypergeometric function for large $t$.
If $\mu$ is not zero or a positive integer, $F(-\mu, -\mu, 1, -t) = A(\mu)\,t^{\mu} \log t\,[1+o(1)]$, $A(\mu) \ne 0$, for
large $t$, and $u(x)$ becomes logarithmically infinite when $x \to 0$. Thus the Laguerre polynomials
are the only solutions of the boundary value problem. For the properties of $L_n(x)$ used in
this section, see Szegő [5, §§5.1, 5.7, 6.31, and 7.21].

It remains to determine the class $F$. If $f(x) \in F$ then

$$f(x) = \sum_{n=0}^{\infty} f_n L_n(x), \qquad \sum_{n=1}^{\infty} n^m |f_n| < \infty \tag{2.10.5}$$

for all $m$. Multiplying on both sides in (2.10.5) by $e^{-x/2}$ we obtain a series
which is uniformly convergent in $[0, \infty]$ and the terms of which tend to zero
when $x \to \infty$. It follows that $e^{-x/2}f(x)$ is continuous in $[0, \infty]$ and tends to
zero when $x \to \infty$. We show next that $xf(x)$ and $f'(x)$ must belong to $F$
whenever $f(x)$ does. Multiplying both sides of (2.10.5) by $x$, reducing with the aid
of (2.10.4) and rearranging, we obtain the series

$$xf(x) = \sum_{n=0}^{\infty} \{(2n+1)f_n - nf_{n-1} - (n+1)f_{n+1}\}\, L_n(x),$$

which clearly belongs to $F$. Similarly we obtain the series

$$f'(x) = -\sum_{n=0}^{\infty} \Big\{\sum_{\nu=n+1}^{\infty} f_\nu\Big\}\, L_n(x)$$

from (2.10.5) with the aid of (2.10.3). This is also an element of $F$. It follows
that any function of the form $x^m f^{(k)}(x) \in F$ and

$$\lim_{x \to \infty} x^m e^{-x/2} f^{(k)}(x) = 0, \qquad k, m = 0, 1, 2, \cdots. \tag{2.10.6}$$

At the origin we find of course that $f^{(k)}(x)$ tends to a finite limit for every $k$.

Conversely, if $f(x) \in C^{(\infty)}[0, \infty)$ and satisfies the boundary conditions
(2.10.6), then $e^{-x/2}L^k f(x) \in L_2(0, \infty)$ for every $k$. Using Lagrange's identity
we verify that

$$\int_0^{\infty} e^{-x} L_n(x) L^k f(x)\,dx = (-n)^k f_n$$

and from this we conclude that $f(x) \in F$.

The class $F\{xD^2+(1-x)D, L_n(x), n; (0, \infty)\}$ equals the subset of
$C^{(\infty)}[0, \infty)$ the elements of which satisfy conditions (2.10.6).


THEOREM 14. Let $L=xD^2+(1-x)D$. Let $f(x) \in C^{(\infty)}[0, \infty)$ and satisfy the
boundary conditions (2.10.6). If $\liminf_{k \to \infty} V[L^k f(x)] = N$, then $f(x)$ is a
polynomial of degree $N$. Conversely, if $f(x)$ is a real polynomial of exact degree $N$,
then $V[L^k f(x)] = N$ for all large values of $k$(24).


Theorems 12, 13, and 14 give three distinct unique characterizations of
real ordinary polynomials in terms of their behavior with respect to certain
second order differential operators. This is analogous to the unique
characterization of trigonometric polynomials by means of the operator $D^2$ given by
Pólya and Wiener.
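
The three characterizations rest on the fact that the Legendre, Hermite and Laguerre polynomials are characteristic functions of the respective operators, with characteristic values $n(n+1)$, $2n$ and $n$. A quick numerical check of the relation $(L+\mu_n)u_n = 0$ is sketched below; it assumes NumPy's `Legendre`, `Hermite` (physicists' convention) and `Laguerre` basis classes, and the names `CASES` and `residual` are illustrative only.

```python
import numpy as np
from numpy.polynomial import Hermite, Laguerre, Legendre, Polynomial

X = Polynomial([0.0, 1.0])  # the polynomial x

# name -> (operator L acting on a Polynomial, basis class, characteristic value mu_n)
CASES = {
    "Legendre": (lambda p: (1 - X**2) * p.deriv(2) - 2 * X * p.deriv(1),
                 Legendre, lambda n: n * (n + 1)),
    "Hermite": (lambda p: p.deriv(2) - 2 * X * p.deriv(1),
                Hermite, lambda n: 2 * n),
    "Laguerre": (lambda p: X * p.deriv(2) + (1 - X) * p.deriv(1),
                 Laguerre, lambda n: n),
}


def residual(name, n):
    """Largest coefficient (in absolute value) of L[u_n] + mu_n * u_n."""
    op, basis, mu = CASES[name]
    p = basis.basis(n).convert(kind=Polynomial)  # n-th basis polynomial in powers of x
    r = op(p) + mu(n) * p
    return float(np.max(np.abs(r.coef)))


for name in CASES:
    print(name, max(residual(name, n) for n in range(7)))
```

The residuals vanish up to rounding error, so applying $L^k$ to a finite expansion simply multiplies the $n$th coefficient by $(-\mu_n)^k$, which is why the highest term eventually governs the number of sign changes.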

2.11. Bessel operators. Our last examples will deal with semi-singular
operator problems related to the theory of Bessel functions. In this theory we
find essentially three different types of expansions, conventionally referred to
as the Bessel-Fourier, the Neumann, and the Schlömilch series. Only the first
type falls directly under our theory, but the third type is also accessible to
the methods of Pólya and Wiener.

We start with the operator

$$L = D^2 + D/x - m^2/x^2, \tag{2.11.1}$$

where $m \ge 0$ is fixed. We take the interval $(0, 1)$ of which one end point is
singular and the other regular. It is easily seen that $R(x, 1; \lambda) \to \infty$ when $x \to 0$
so the problem is of type $T_4$. We have $y(x; 1, \lambda) \sim B(\lambda) \log (1/x)$ or $B(\lambda)x^{-m}$
at $x=0$ according as $m=0$ or is greater than 0. The corresponding boundary
value problem for $m>0$ is

$$(L + \mu^2)u = 0, \qquad \lim_{x \to 0} x^m u(x) = 0, \qquad C_1 u(1) + C_2 u'(1) = 0, \tag{2.11.2}$$

where $C_1 \ge 0$, $C_2 \ge 0$, $C_1+C_2>0$. If $m=0$ the factor $x^m$ should be replaced by
$[\log (1/x)]^{-1}$. The problem has as its solution the set $\{J_m(\mu_n x)\}$ where $\mu_n$ runs
through the positive roots of the equation

$$C_1 J_m(\mu) + C_2\, \mu J_m'(\mu) = 0. \tag{2.11.3}$$

If $m=0$, $C_1=0$, we have to add $\mu_0 = 0$ with $u_0(x) = 1$(25).

Using any standard text on Bessel functions, the reader will have no
difficulties in proving that the corresponding system $S$ is conservative. We note
in particular that $Z_n(0, 1) = V[J_m(\mu_n x)] = n$ so that $M(N) = N$ in Theorem 9.
We shall not state the corresponding form of the theorem, but we shall
determine the class $F$.

If $f(x) \in F \subset B^{(4)}\{L; C_1, C_2; (0, 1]\}$ then we must have

$$C_1 L^n f(1) + C_2 DL^n f(1) = 0, \qquad n = 0, 1, 2, \cdots, \tag{2.11.4}$$


(24) Similar results hold for the general Laguerre operator $L=xD^2+(1+\alpha-x)D$, $\alpha>-1$.

(25) The only solution which satisfies the boundary condition at $x=0$ is a multiple of
$J_m(\mu x)$. The values of $\mu$ are determined by the second condition. The root $\mu=0$ figures if and
only if $C_1/C_2 = -m$. Owing to our sign restrictions, this case occurs only if $m=0$, $C_1=0$.


with our usual notation. At the singular end point we can write $f(x) = x^m g(x)$.
A simple computation shows that if $g(x)$ is defined in $[-1, +1]$ by the
convention $g(-x) = g(x)$ then $g(x) \in C^{(\infty)}[-1, +1]$.

Suppose, conversely, that $f(x) = x^m g(x)$ where $g(-x) = g(x)$, $g(x)
\in C^{(\infty)}[-1, +1]$, and (2.11.4) is satisfied. The computation shows that $Lf(x)$
satisfies the same conditions. We can then apply the identity of Lagrange and
find that

$$\int_0^1 x\,u_n(x) L^k f(x)\,dx = \int_0^1 x f(x) L^k u_n(x)\,dx = (-\mu_n^2)^k f_n,$$

since all intermediary integrated expressions vanish at both end points of
$(0, 1)$. It follows that the coefficients $f_n$ satisfy the conditions $C_1$ and $C_2$ of
§2.1 and $f(x) \in F$.

Thus, the class $F\{L, A_nJ_m(\mu_n x), \mu_n^2; (0, 1)\}$ consists of all functions of
the form $f(x) = x^m g(x)$, satisfying (2.11.4), such that $g(-x) = g(x)$ and $g(x)
\in C^{(\infty)}[-1, +1]$.

The Neumann series

$$\sum_{n=0}^{\infty} f_n J_n(x) \tag{2.11.5}$$

does not give rise to any interesting oscillation problems for the simple reason
that in any fixed interval $(0, b)$ the function $J_n(x)$ is ultimately
non-oscillatory since the least positive zero of $J_n(x)$ exceeds $n$.

We get more interesting results for the Schlömilch series

$$(f_0/2) + \sum_{n=1}^{\infty} f_n J_0(nx), \tag{2.11.6}$$

the terms of which are characteristic functions of the operator (2.11.1) with
$m=0$ corresponding to the characteristic values $n^2$. The corresponding
system $S$ is not admissible in the technical sense of §2.1, since the functions
$\{J_0(nx)\}$ do not form an orthogonal system. But the methods employed in
the present paper nevertheless apply and lead to a result which we state
without proof(26).


THEOREM 15. Let F be the class of functions defined by the formula

(2.11.7) $f(x) = (2/\pi) \int_0^{\pi/2} g(x \sin t)\,dt, \quad g(u) = (f_0/2) + \sum_{n=1}^{\infty} f_n \cos nu,$

where $g(u)$ is any real even function of period $2\pi$ belonging to $C^{(\infty)}(-\infty, \infty)$. Let
$N_k(R)$ be the number of sign changes of $\{D^2 + (1/x)D\}^k f(x)$ in the interval


(26) See E. T. Whittaker and G. N. Watson [6, §17.82], for the relation between the series
(2.11.6) and (2.11.7).




496 EINAR HILLE 


(0, R). If

$\liminf_{k \to \infty} \limsup_{R \to \infty} N_k(R)/R = C < \infty,$

then $g(u)$ is a trigonometric polynomial of degree at most $C\pi$.


APPENDIX 


3.1. Characteristic series representing entire functions. Pólya and Wiener
[2, Theorem III] proved for periodic functions $f(x)$ that the assumption
$V[D^{2k} f(x)] = o(k^{1/2})$ implies that $f(x)$ is an entire function. This result also
admits of far-reaching generalizations but cannot be true for arbitrary conservative
systems. It is obviously necessary that the functions $u_n(x)$ themselves be
entire. It is also necessary to have some definite information concerning the
convergence properties of the series for complex values. In this direction
it is enough to know that a condition of the form


$\limsup_{n \to \infty} |f_n| \exp(r\mu_n) = \infty$


for some finite $r$ prevents the convergence of the series in the whole finite
plane, while on the other hand the finiteness of the limit superior for every $r$
implies that the series does converge in the whole plane. The matter is complicated
by the fact that an entire function may have a characteristic series
which is not convergent outside of the real interval $(a, b)$ or even anywhere(27).
In addition it is desirable to have more precise information concerning the
characteristic values and the degree of regularity of the oscillations of the
characteristic functions in fixed interior intervals. It is not worth while stating
here in precise form the assumptions under which we have succeeded in
extending Theorem III of Pólya and Wiener. It is enough to mention that
the results apply to the operators of Legendre, Jacobi, Hermite and Laguerre.
We state without proof:


THEOREM 16. The condition $V[L^k f(x)] = o(k^{1/2})$ is sufficient in order that
$f(x) = \sum_{n=0}^{\infty} f_n u_n(x)$ shall define an entire function, the series being convergent in
the finite complex plane, provided $S = S\{L, u_n(x), \mu_n; (a, b)\}$ is one of the five
systems considered in §§2.7 to 2.10.


For the case of the Legendre operator, this theorem has also been proved
by Szegő ([4], special case of his Theorem D) and with a much less restrictive
condition on the rate of growth of $N_k$. His method would also apply to the
Jacobi case, at least for $\alpha > -1$, $\beta > -1$, with a similar improvement of
the rate of growth condition. His method, however, does not apply to the
Hermite, Hermite-Weber, and Laguerre operators.


(27) This happens, for instance, in the case of expansions in terms of Hermite and Laguerre
polynomials, but not in the Jacobi and Legendre cases.




3.2. Upper limits for the frequency of oscillation. It has been conjectured
(by Pólya, at least for the operator $D$) that $o(k)$ is the correct order in theorems
of the type of our Theorem 16 and that this order cannot be raised to
$O(k)$. The latter part of the conjecture has been proved by Pólya and Szegő
[4, §7]. It is very easy to verify that $O(k)$ is not admissible in the case of two
rather wide classes of second order operators.

Suppose first that $(a, b)$ is a finite or semi-infinite interval and that the
coefficients $p_i(x)$ of $L$ are polynomials. Take $f(x) = 1/(x - c)$, where $c$ is real
and outside of $[a, b]$. A simple computation shows that the $L^k$-transform of
$f(x)$ is a rational function whose denominator is $(x - c)^{2k+1}$ while the numerator
is a polynomial of degree at most $Ak$, where $A$ is a constant depending
only upon the degree of the polynomials $p_i(x)$. It is clear that for this function
$V[L^k f(x)] \le Ak$ and $f(x)$ is not entire. If $(a, b) = (-\infty, \infty)$, we take
$f(x) = 1/(x^2 + c^2)$ instead.
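
As a concrete check of this count, one may take the simplest operator with polynomial coefficients, $L = D^2$. This choice is an illustrative assumption and is not singled out in the text. The sketch below works out the $L^k$-transform of $f(x) = 1/(x - c)$ explicitly:

```latex
% Illustrative special case (assumption): L = D^2, f(x) = 1/(x - c), c outside [a, b].
\[
  L^k f(x) \;=\; D^{2k} (x - c)^{-1} \;=\; \frac{(2k)!}{(x - c)^{2k+1}} .
\]
% The denominator is (x - c)^{2k+1}, as stated in the text, and the numerator
% is a constant. Since x - c keeps one sign on [a, b], L^k f has no sign
% change there, so V[L^k f] = 0 <= Ak, while f has a pole at x = c and is
% therefore not entire.
```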

If $(a, b) = (-\pi, \pi)$ and the coefficients $p_i(x)$ are trigonometric polynomials,
we have similar results. We take $f(x) = 1/(2 - \sin x)$ instead. Here
$L^k f(x)$ is the quotient of two trigonometric polynomials, the degree of the
numerator being at most $Ak$. Hence $V[L^k f(x)] \le 2Ak$ and $f(x)$ is not entire.

Finally it should be observed that all the available evidence so far supports
the conjecture that $V[L^k f(x)] = O(k)$ is a necessary and sufficient condition
in order that an admissible characteristic series shall define an analytic
function.

REFERENCES 

1. E. L. Ince, Ordinary Differential Equations, London, 1927.

2. G. Pólya and N. Wiener, On the oscillation of the derivatives of a periodic function, these
Transactions, vol. 52 (1942), pp. 249-256.

3. L. Schlesinger, Handbuch der Theorie der linearen Differentialgleichungen, Leipzig, vol. I,
1895, vol. II_1, 1897, vol. II_2, 1898.

4. G. Szegő, On the oscillation of differential transforms. I, these Transactions, vol. 52
(1942), pp. 450-462.

5. G. Szegő, Orthogonal Polynomials, American Mathematical Society Colloquium Publications,
New York, vol. 23, 1939.

6. E. T. Whittaker and G. N. Watson, A Course in Modern Analysis, 4th edition, Cambridge,
1935.


STANFORD UNIVERSITY,
STANFORD UNIVERSITY, CALIF.
YALE UNIVERSITY,
NEW HAVEN, CONN.


INTEGRATION IN A CONVEX LINEAR TOPOLOGICAL SPACE 


BY 
C. E. RICKART 


INTRODUCTION 


The results of this paper center around the definition of an integration
process for multi-valued set functions which are defined over a σ-field $\mathfrak{M}$ and
whose values lie in a convex linear topological space $\mathfrak{X}$. As such they represent
a substantial generalization of the basic results contained in a paper by
A. Kolmogoroff [7](1), who considered the case in which $\mathfrak{X}$ is the real numbers.
On the other hand, the method of defining the integral is a generalization of
that used by R. S. Phillips(2) [12, p. 118], although Phillips considered integration
only with respect to a positive numerical measure function and
restricted the integral to be single-valued. The importance of the Phillips
definition lies in the fact that it relieves one of the necessity of considering
infinite sums. Throughout the paper emphasis is placed on a type of convergence for
sets in a linear topological space which is analogous to the Hausdorff notion
of convergence for sets in a metric space [5, §28]. G. B. Price has made a
similar use of the Hausdorff convergence for sets [13, Parts II, V].

The contents of the paper are divided into four parts. Part I (§§1-3) contains
a short discussion of convex linear topological spaces, a definition of the
notion of unconditional summability, which plays a central role in the definition
of the integral, and two theorems on additive set functions. Part II
(§§4-9) contains the general theory of the U- and SU-integrals. The U-integral
is multi-valued and is defined for multi-valued set functions $F(\sigma)$. The
SU-integral is the single-valued specialization of the U-integral. Definitions
and basic properties of these integrals account for §§4, 5. Section 6 contains
a discussion of a generalization of the Kolmogoroff [7] notion of differential
equivalence applied to the U-integral, and §7 contains a proof that the transform
of an integrable function by a general type of linear transformation is
integrable. In §8 it is shown that the definition of the SU-integral can be
weakened in case $\mathfrak{X}$ is complete in a certain sense. Section 9 contains a convergence
theorem for the SU-integral which involves a generalization of the
notion of approximate convergence to functions $F(\sigma)$ of the type considered
here. The approximate convergence is relative to a positive numerical measure
function $m(\sigma)$ whose only relation to the integral lies in the condition that

Presented to the Society, April 11, 1941; received by the editors September 8, 1941, and, 
in revised form, February 11, 1942. 

(1) Numbers in brackets refer to the bibliography at the end of the paper.

(2) See also a paper by Garrett Birkhoff [2, p. 51] where the same definition used by
Phillips is given.




INTEGRATION IN A TOPOLOGICAL SPACE 499 


the integrals of the functions considered be absolutely continuous relative
to $m(\sigma)$. Part III (§§10-13) is concerned with the $U_B$-integral which is a
specialization of the SU-integral and may be described as integration with
respect to a “bilinear” function. The “bilinear” function $B[y, \sigma]$ is a generalization
of $m(\sigma)y$, where $m(\sigma)$ is a numerical measure function. It has its values
in $\mathfrak{X}$ and is defined for $y$ in a linear space $\mathfrak{Y}$ and $\sigma$ in the σ-field $\mathfrak{M}$. It is linear
in $y$ for each $\sigma$ and completely additive in $\sigma$ for each $y$. Functions $y(\sigma)$ to be
integrated have their values in $\mathfrak{Y}$ while the value of the integral is in $\mathfrak{X}$. Section 10
contains a definition of the $U_B$-integral and a discussion of its fundamental
properties. In §11 the $U_B$-integral is considered for the case in which
$\mathfrak{X}$ is complete in the sense of §8. Section 12 contains a discussion of absolute
continuity and a convergence theorem for the $U_B$-integral. In §13 the existence
of the $U_B$-integral is proved for a certain class of measurable functions.
Part IV (§§14-16) relates the above integrals to previously defined integrals.
For the case in which $\mathfrak{X}$ is the real numbers, the SU-integral includes an integral
of Kolmogoroff [7]. The $U_B$-integral reduces in a special case to an
integral of Phillips [12], and a specialization of the $U_B$-integral includes an
integral of Price [13]. Relation of the $U_B$-integral to the various other integrals
which have been defined can be obtained through its relation to the
Phillips integral (see [12, §7]).


Part I. PRELIMINARY CONSIDERATIONS 


1. Convex linear topological spaces. The type of linear topological space
$\mathfrak{X}$ to be considered here is that introduced by J. von Neumann [11, p. 4]. It
is defined as follows:

A set $\mathfrak{X}$ of elements $x$ is said to constitute a linear topological space provided
it is linear(3) (Banach [1, p. 26]) and provided it contains a family $\mathcal{V}$
of subsets such that

(1) $\theta \in V$ for every $V \in \mathcal{V}$.

(2) $x \in V$ for every $V \in \mathcal{V}$ implies $x = \theta$.

(3) $V_1, V_2 \in \mathcal{V}$ implies the existence of $V_3 \in \mathcal{V}$ such that(4) $V_3 \subset V_1 \cap V_2$.

(4) $V \in \mathcal{V}$ implies the existence of $V' \in \mathcal{V}$ such that(5) $V' + V' \subset V$.

(5) $V \in \mathcal{V}$ implies the existence of $V' \in \mathcal{V}$ such that(5) $\alpha V' \subset V$ for all
$|\alpha| \le 1$.

(6) $x \in \mathfrak{X}$ and $V \in \mathcal{V}$ imply the existence of $\alpha$ such that $x \in \alpha V$.

Also, $\mathfrak{X}$ is said to be convex provided

(7) $V \in \mathcal{V}$ implies $V + V \subset 2V$.


(3) Scalar multipliers are assumed real. The zero element will be denoted by $\theta$.

(4) The symbols $\subset$, $\cap$, $\cup$ denote, respectively, set-theoretic “included in,” “intersection”
and “union,” and will be used with their usual variations throughout the paper. $A \cap CB$ denotes
the set of points contained in $A$ but not in $B$.

(5) $V_1 + V_2 = \{x_1 + x_2 \mid x_1 \in V_1, x_2 \in V_2\}$, where $\{x \mid P\}$ denotes the set of all elements $x$
subject to the condition $P$. Similarly, $\alpha V = \{\alpha x \mid x \in V\}$.


500 C. E. RICKART : [November 


A set $G$ in $\mathfrak{X}$ is defined to be open provided, for every $x \in G$, there exists
$V \in \mathcal{V}$ such that $x + V \subset G$. The interior of a set $X$ is defined by
$X_i = \{x \mid x + V \subset X \text{ for some } V \in \mathcal{V}\}$. $X_i$ is evidently open. A set is defined to
be closed if it is the complement of an open set, and the closure of a set $X$
is defined by $X_{cl} = C((CX)_i)$. Evidently $X_{cl}$ is closed. The above class of open
sets defines a regular Hausdorff topology in $\mathfrak{X}$ so that the operations of addition
and multiplication by a number are continuous [11, Theorem 6]. It is
known (Wehausen [16, Theorem 1]) that this topology is equivalent to that
introduced by Kolmogoroff [8, p. 29]. In all that follows $\mathfrak{X}$ will be assumed
to be a convex linear topological space as above defined. An important consequence
of the convexity of the space $\mathfrak{X}$ is that the closure $V_{cl}$ of every $V \in \mathcal{V}$
is a convex set; that is, $C_v V_{cl} = V_{cl}$, where $C_v X = \{\sum_{i=1}^{n} \alpha_i x_i \mid x_i \in X,\ \alpha_i \ge 0,\ \sum_{i=1}^{n} \alpha_i = 1,\ n \text{ arbitrary}\}$.
This implies that $\sum_{i=1}^{n} \alpha_i V_{cl} = (\sum_{i=1}^{n} \alpha_i) V_{cl}$ for arbitrary
$\alpha_i \ge 0$. Another important consequence of the convexity of $\mathfrak{X}$ is that
$0 < \alpha < \beta$ implies $\alpha V_{cl} \subset \beta V$ for every $V \in \mathcal{V}$. In much of the material which
follows, the assumption that $\mathfrak{X}$ is convex could be avoided; however, computation
is greatly simplified by using it and some of the important theorems
(for example, Theorems 3.1, 5.5, 9.5) seem to involve it rather deeply.
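
To make condition (7) and the last consequence concrete, the following sketch checks them in the familiar special case of a normed space. The choice of a normed space and the notation $V_r$ for the open sphere of radius $r$ about $\theta$ are assumptions made only for illustration:

```latex
% Assumption (illustrative): X is a normed space, V_r = { x : ||x|| < r }, r > 0.
\[
  V_r + V_r \;=\; \{x + y : \|x\| < r,\ \|y\| < r\}
  \;\subset\; \{z : \|z\| < 2r\} \;=\; 2V_r ,
\]
% which is condition (7). The closure of V_r is the closed sphere, so that
\[
  0 < \alpha < \beta \;\Longrightarrow\;
  \alpha (V_r)_{cl} \;=\; \{x : \|x\| \le \alpha r\}
  \;\subset\; \{x : \|x\| < \beta r\} \;=\; \beta V_r .
\]
```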

It will be desirable later(6) to subject $\mathfrak{X}$ to a completeness condition which
we introduce here. It is convenient to define the condition in terms of the
Moore-Smith [10, p. 103] general limits notion on a class $\mathcal{L}$ with a transitive
compositive relation $R$ on $\mathcal{L}$. A set $\{x_l\}$ of elements in $\mathfrak{X}$, where $l$ ranges
over $\mathcal{L}$, is called an $\mathcal{L}$-directed set. $\{x_l\}$ is called a fundamental $\mathcal{L}$-directed set
provided, for every $V \in \mathcal{V}$, there exists $l_V$ such that $l_i R l_V$ ($i = 1, 2$) implies
$x_{l_1} - x_{l_2} \in V$. The space $\mathfrak{X}$ is said to be complete relative to $\mathcal{L}$ provided every
fundamental $\mathcal{L}$-directed set converges (in the Moore-Smith sense) to an element
of $\mathfrak{X}$. We will be interested in two important specializations of this completeness
notion, namely, the case where $\mathcal{L}$ is the set of positive integers
and $R$ is the usual order relation “>,” which gives the ordinary sequential
completeness, and the case where $\mathcal{L}$ is the family of neighborhoods $\mathcal{V}$ and $R$
is the set-theoretic “included in” (that is, $V_1 R V_2$ means $V_1 \subset V_2$)(7). It is easy
to prove that, if $\mathfrak{X}$ satisfies the first countability axiom (Hausdorff [5, p. 229])
and is sequentially complete, then it is complete relative to $\mathcal{V}$. It follows that,
if $\mathfrak{X}$ is a Banach space with its norm topology (that is, $\mathcal{V}$ is the family of
spheres with center $\theta$), then $\mathfrak{X}$ is complete relative to $\mathcal{V}$.

2. Unconditional summability. In the following, $\pi$ will always denote a
finite set of positive integers and $\pi_1 \ge \pi_2$ will mean that $\pi_1$ contains $\pi_2$. Also,
$\sum_\pi$ will mean summation over those $n \in \pi$. Two subsets $X$, $Y$ of $\mathfrak{X}$ are said to
be equal within $V$ provided $X \subset Y + V$ and $Y \subset X + V$.
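
The following sketch illustrates "equal within $V$" on the real line. Taking $\mathfrak{X}$ to be the real numbers and $V$ an open interval about $0$ is an assumption made only for the example, as are the particular sets $X$, $Y$, $Y'$:

```latex
% Assumption (illustrative): X is the real line, V = (-\varepsilon, \varepsilon).
\[
  X = [0, 1], \qquad Y = [0,\, 1 + \varepsilon/2] :
  \qquad X \subset Y \subset Y + V, \qquad
  Y \subset (-\varepsilon,\, 1 + \varepsilon) = X + V ,
\]
% so X and Y are equal within V. By contrast, for Y' = [0, 1 + 2\varepsilon]
% the point 1 + 2\varepsilon does not lie in X + V, so X and Y' are not
% equal within V. In a metric space this is the statement that the
% Hausdorff distance between the two sets is less than \varepsilon.
```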


(6) See Theorems 8.5, 12.3 below.

(7) The space $\mathfrak{X}$ may be said to be complete provided it is complete relative to every $\mathcal{L}$. This
can be shown to be equivalent to simply completeness relative to $\mathcal{V}$ (see Graves [4,
p. 62]).




2.1. DEFINITION. Two sequences $\{X_n\}$, $\{Y_n\}$ of subsets of $\mathfrak{X}$ are said to
be summably equal within $V$ provided there exists a $\pi_0$ such that $\pi \ge \pi_0$ implies
$\sum_\pi X_n$, $\sum_\pi Y_n$ are equal within $V$.


2.2. DEFINITION. A sequence $\{X_n\}$ of subsets of $\mathfrak{X}$ is said to be unconditionally
summable (we write u.s.) to a set $X$ with respect to $V$ provided there exists $\pi_0$
such that $\pi \ge \pi_0$ implies $X$, $\sum_\pi X_n$ are equal within $V$.


Observe(8) that $\{X_n\}$ is u.s. to a single element $x \in \mathfrak{X}$ with respect to $V$
provided there exists $\pi_0$ such that $\pi \ge \pi_0$ implies $\pm\{\sum_\pi X_n - x\} \subset V$. This special
case of the above notion of unconditional summability was used by R. S.
Phillips [12, p. 118]. Observe also that, if each of the sets $X_n$, $X$ consists of
only a single element, then u.s. of $\{X_n\}$ to $X$ with respect to every $V \in \mathcal{V}$
reduces to the ordinary unconditional convergence.


2.3. THEOREM. In order that $\{X_n\}$ be u.s. to $X$ with respect to $V$ it is both
necessary and sufficient that, for every rearrangement $n(k)$ of the sets $X_n$, there
exists $k_0$ for which $k \ge k_0$ implies $X$, $\sum_{i=1}^{k} X_{n(i)}$ are equal within $V$.


The method of proof used in an analogous situation involving uncondi- 
tional convergence of series in normed vector spaces can be applied with 
slight modification here (see Hildebrandt [6, p. 90]). 

3. Additive set functions. Let $M$ be an abstract set of elements and $\mathfrak{M}$ a
σ-field(9) of subsets of $M$. Elements of $\mathfrak{M}$ will be denoted by $\sigma$. Also let $m(\sigma)$
be a positive, completely additive measure function defined over $\mathfrak{M}$. A single-valued
function $x(\sigma)$ on $\mathfrak{M}$ to the space $\mathfrak{X}$ is said to be completely additive if for
every sequence $\{\sigma_n\}$ of disjoint elements of $\mathfrak{M}$, the series $\sum x(\sigma_n)$ is unconditionally
convergent to $x(\bigcup \sigma_n)$. $x(\sigma)$ is said to be absolutely continuous relative
to $m(\sigma)$ provided, for every $V \in \mathcal{V}$, there exists $\delta_V > 0$ such that $x(\sigma) \in V$ whenever
$m(\sigma) < \delta_V$.


3.1. THEOREM(10). If $x(\sigma)$ is completely additive on $\mathfrak{M}$ to $\mathfrak{X}$ and if $m(\sigma) = 0$
implies $x(\sigma) = \theta$, then $x(\sigma)$ is absolutely continuous relative to $m(\sigma)$.


Suppose(11) the theorem not true; then there exists a $V \in \mathcal{V}$ and a sequence
of elements $\sigma_k \in \mathfrak{M}$ such that $\lim_{n \to \infty} m(\bigcup_{k=n}^{\infty} \sigma_k) = 0$ and such that(12)
$\|x(\sigma_k)\|_V > 2$ for each $k$. Now, by a result of Wehausen [16, Theorem 8],
there exists a linear continuous operation $\bar{x}_1$ on $\mathfrak{X}$ to the real numbers such
that $|\bar{x}_1(x)| \le \|x\|_V$ for all $x$ and $\bar{x}_1(x(\sigma_1)) = \|x(\sigma_1)\|_V$. It is obvious that


(8) Unconditional summability obviously involves a convergence notion related to the well
known Hausdorff convergence for sets in a metric space [5, §28].

(9) The σ-field $\mathfrak{M}$ has the following properties: (1) $\mathfrak{M}$ contains the empty set; (2) if $\sigma \in \mathfrak{M}$,
then $M \cap C\sigma \in \mathfrak{M}$; (3) if $\sigma_n \in \mathfrak{M}$ ($n = 1, 2, \ldots$), then $\bigcup \sigma_n \in \mathfrak{M}$.

(10) See Dunford [3, Theorem 42] and Kunisawa [9, pp. 68, 69].

(11) The method of proof used here was suggested to the writer by R. S. Phillips.

(12) $\|x\|_V$ is the von Neumann pseudo-norm; that is, $\|x\|_V = \max(\|x\|_V^+, \|-x\|_V^+)$, where
$\|x\|_V^+ = \text{g.l.b.}\ \{\alpha \mid \alpha > 0,\ x \in \alpha V\}$ [11, pp. 18, 19].




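$f(\sigma) = \bar{x}_1(x(\sigma))$ is a completely additive, real-valued function of $\sigma$ such that
$m(\sigma) = 0$ implies $f(\sigma) = 0$. Therefore, by a well known theorem (see Saks [15,
Theorem 13.2, p. 31]), $f(\sigma)$ is absolutely continuous relative to $m(\sigma)$. It follows
that there exists an $n_1$ such that $|f(\bigcup_{k=n}^{\infty}(\sigma_1 \cap \sigma_k))| < 1$ for $n \ge n_1$.
Now, if we take(13) $\sigma^1 = \sigma_1 \cap C[\bigcup_{k=n_1}^{\infty} \sigma_k]$, it is clear that
$f(\sigma_1) = f(\sigma^1) + f(\bigcup_{k=n_1}^{\infty}(\sigma_1 \cap \sigma_k))$. But $f(\sigma_1) = \bar{x}_1(x(\sigma_1)) = \|x(\sigma_1)\|_V > 2$; therefore,
$f(\sigma^1) = \bar{x}_1(x(\sigma^1)) > 1$. Since $|\bar{x}_1(x(\sigma^1))| \le \|x(\sigma^1)\|_V$, we have $\|x(\sigma^1)\|_V > 1$. Moreover,
$\sigma^1 \cap \sigma_n = 0$ for all $n \ge n_1$. Repeating the above procedure on the sequence
$\sigma_{n_1}, \sigma_{n_1+1}, \ldots$, one obtains a $\sigma^2 \subset \sigma_{n_1}$ and an $n_2$ such that $\sigma^2 \cap \sigma_n = 0$ for $n \ge n_2$
and $\|x(\sigma^2)\|_V > 1$. This process can be continued indefinitely to obtain a sequence
$\{\sigma^k\}$ of disjoint elements of $\mathfrak{M}$ such that $\|x(\sigma^k)\|_V > 1$ for all $k$. But
this result obviously contradicts the assumption that $x(\sigma)$ be completely additive;
therefore $x(\sigma)$ is absolutely continuous relative to $m(\sigma)$.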


3.2. THEOREM. Let each of the functions $x_n(\sigma)$ be additive and absolutely
continuous relative to $m(\sigma)$. Then, if $\{x_n(\sigma)\}$ is a fundamental sequence for each
$\sigma$, the $x_n(\sigma)$ are equi-absolutely continuous relative to $m(\sigma)$.


A proof identical with that which gives Theorem 6.1 of the Phillips paper
[12, p. 125] applies here, and so will be omitted. The method of proof is due to
Saks [14].


Part II. THE GENERAL THEORY

4. Definitions of the U- and SU-integrals. Let $\mathfrak{M}$ denote, as before, a
σ-field of subsets of an abstract set $M$. A subdivision of $M$ into a finite or denumerable
number of disjoint elements of $\mathfrak{M}$ will be denoted by $\Delta = \{\sigma_i\}$,
where $M = \bigcup \sigma_i$ and $\sigma_i \cap \sigma_j = 0$ ($i \ne j$). $\Delta^1$ is said to be finer than $\Delta^2$ provided,
for every $\sigma_i^1$, there exists a $\sigma_j^2$ such that $\sigma_i^1 \subset \sigma_j^2$; we write $\Delta^1 \ge \Delta^2$. The product
of two subdivisions is a subdivision defined by $\Delta^1 \Delta^2 = \{\sigma_i^1 \cap \sigma_j^2\}$. Evidently
the product of two subdivisions is finer than either one of the subdivisions.
Let $\Delta_0 = \{\sigma_k\}$ be a given fixed subdivision and $\Delta^k$ ($k = 1, 2, \ldots$) an arbitrary
sequence of subdivisions; then the subdivision $\Delta$ which coincides with $\Delta^k$ on
the set $\sigma_k$ ($k = 1, 2, \ldots$) is called the sum of the $\Delta^k$ over $\Delta_0$.
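
A small worked instance of the product of two subdivisions may help fix the notation. The choice $M = [0, 1)$ with the Borel sets as σ-field, and the two particular subdivisions, are assumptions made only for illustration:

```latex
% Assumption (illustrative): M = [0, 1), with the Borel subsets of M as sigma-field.
\[
  \Delta^1 = \{[0, \tfrac12),\ [\tfrac12, 1)\}, \qquad
  \Delta^2 = \{[0, \tfrac13),\ [\tfrac13, 1)\},
\]
\[
  \Delta^1 \Delta^2 = \{[0, \tfrac13),\ [\tfrac13, \tfrac12),\ [\tfrac12, 1)\}
  \qquad (\text{the empty intersection } [\tfrac12, 1) \cap [0, \tfrac13) \text{ is dropped}).
\]
% Every set of the product lies inside one set of \Delta^1 and inside one
% set of \Delta^2, so the product is finer than both factors.
```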

The functions $F(\sigma)$ to be studied are multi-valued and are defined over(14)
$\mathfrak{M}$ (that is, excluding the empty set) with values in $\mathfrak{X}$. Let $\Delta = \{\sigma_i\}$ be an arbitrary
subdivision of $M$; we denote the sequence of sets $\{F(\sigma \cap \sigma_i)\}$ by the
symbol $J(F, \sigma, \Delta)$.


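4.1. DEFINITION. The function $F(\sigma)$ is said to be U-integrable over $\sigma_0$ provided
there is a set $I(F, \sigma_0) \subset \mathfrak{X}$ such that, for every $V \in \mathcal{V}$, there exists $\Delta_{V,\sigma_0}$ for
which $\Delta \ge \Delta_{V,\sigma_0}$ implies $J(F, \sigma_0, \Delta)$ is u.s. to $I(F, \sigma_0)$ with respect to $V$. The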

(13) See Footnote 4 above.

(14) One could develop the following theory for functions defined over only a portion of $\mathfrak{M}$;
however there would be a considerable loss of simplicity in the statement of definitions and theorems.




closure(15) of the set $I(F, \sigma_0)$ will be called the U-integral of $F(\sigma)$ over $\sigma_0$, and
we write $I(F, \sigma_0)_{cl} = \int_{\sigma_0} F(d\sigma)$. Furthermore, if $\int_{\sigma_0} F(d\sigma)$ consists of a single element,
then $F(\sigma)$ is said to be SU-integrable over $\sigma_0$.


4.2. THEOREM. If $F(\sigma)$ is U-integrable on $\sigma_0$, then $\int_{\sigma_0} F(d\sigma)$ is unique.


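Suppose $F(\sigma)$ U-integrable to each of the sets $I_1(F, \sigma_0)$, $I_2(F, \sigma_0)$. Then
it is immediate from the definition that, for every $V \in \mathcal{V}$, there exists a $\Delta_{V,\sigma_0}$
such that $J(F, \sigma_0, \Delta)$ is u.s. to both $I_1(F, \sigma_0)$ and $I_2(F, \sigma_0)$ with respect to $V$
for $\Delta \ge \Delta_{V,\sigma_0}$; that is, if $\Delta = \{\sigma_i\}$, then there exists $\pi_0$ such that $\pi \ge \pi_0$ implies
$\sum_\pi F(\sigma_0 \cap \sigma_i)$, $I_n(F, \sigma_0)$ are equal within $V$ ($n = 1, 2$). It follows immediately
from this result that $I_1(F, \sigma_0)$, $I_2(F, \sigma_0)$ are equal within $2V$. Therefore,
$I_k(F, \sigma_0) \subset I_l(F, \sigma_0) + 2V$ ($k = 1, 2$; $l = 1, 2$). Since $V$ is arbitrary, $I_k(F, \sigma_0) \subset I_l(F, \sigma_0)_{cl}$;
hence $I_1(F, \sigma_0)_{cl} = I_2(F, \sigma_0)_{cl}$.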
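
The step from "equal within $V$" to "equal within $2V$" in this argument uses the convexity condition (7) of §1. A short derivation, written with $S_\pi$ as an abbreviation (introduced here only for the sketch) for the partial sum $\sum_\pi F(\sigma_0 \cap \sigma_i)$, runs as follows:

```latex
% S_\pi abbreviates \sum_\pi F(\sigma_0 \cap \sigma_i); the abbreviation is not in the text.
\[
  I_1(F, \sigma_0) \;\subset\; S_\pi + V
  \;\subset\; \bigl(I_2(F, \sigma_0) + V\bigr) + V
  \;=\; I_2(F, \sigma_0) + (V + V)
  \;\subset\; I_2(F, \sigma_0) + 2V ,
\]
% where the first two inclusions express equality within V of S_\pi with
% I_1 and I_2 respectively, and the last inclusion is condition (7),
% V + V \subset 2V. Interchanging I_1 and I_2 gives the reverse inclusion.
```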

Observe that, if $F(\sigma \cap \sigma_0) = \theta$ for every $\sigma$, then $F(\sigma)$ is U-integrable on $\sigma_0$
to the value $\theta$.

5. Properties of the integrals(16).


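5.1. THEOREM. If $F(\sigma)$, $G(\sigma)$ are U-integrable on $\sigma_0$ and $\alpha$ is any real number,
then $\alpha F(\sigma)$, $F(\sigma) + G(\sigma)$ are U-integrable on $\sigma_0$ and


$\int_{\sigma_0} \alpha F(d\sigma) = \alpha \int_{\sigma_0} F(d\sigma), \quad \int_{\sigma_0} [F(d\sigma) + G(d\sigma)] = \Big[\int_{\sigma_0} F(d\sigma) + \int_{\sigma_0} G(d\sigma)\Big]_{cl}.$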


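If $\alpha = 0$, the statement for $\alpha F(\sigma)$ is obvious, and, if $\alpha \ne 0$, the desired result
is a consequence of the fact that, if $J(F, \sigma_0, \Delta)$ is u.s. to $I(F, \sigma_0)$ with
respect to $V$, then $J(\alpha F, \sigma_0, \Delta)$ is u.s. to $\alpha I(F, \sigma_0)$ with respect to $\alpha V$.

In the case of $F(\sigma) + G(\sigma)$, we observe that, for arbitrary $V \in \mathcal{V}$, there
exists a $\Delta_{V,\sigma_0}$ such that, if $\Delta \ge \Delta_{V,\sigma_0}$, then $J(F, \sigma_0, \Delta)$, $J(G, \sigma_0, \Delta)$ are, respectively,
u.s. to $\int_{\sigma_0} F(d\sigma)$, $\int_{\sigma_0} G(d\sigma)$ with respect to $V$. From this it is immediate
that $J(F + G, \sigma_0, \Delta)$ is u.s. to $\int_{\sigma_0} F(d\sigma) + \int_{\sigma_0} G(d\sigma)$ with respect to $2V$. Since $V$
is arbitrary, the desired result follows.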


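5.2. COROLLARY. If $F(\sigma)$, $G(\sigma)$ are SU-integrable on $\sigma_0$, then


$\int_{\sigma_0} \alpha F(d\sigma) = \alpha \int_{\sigma_0} F(d\sigma), \quad \int_{\sigma_0} [F(d\sigma) + G(d\sigma)] = \int_{\sigma_0} F(d\sigma) + \int_{\sigma_0} G(d\sigma).$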


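5.3. THEOREM. If $F(\sigma)$ is U-integrable on both $\sigma_1$, $\sigma_2$ and if $\sigma_1 \cap \sigma_2 = 0$, then
$F(\sigma)$ is U-integrable on $\sigma_1 \cup \sigma_2$ and


$\int_{\sigma_1 \cup \sigma_2} F(d\sigma) = \Big[\int_{\sigma_1} F(d\sigma) + \int_{\sigma_2} F(d\sigma)\Big]_{cl}.$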


(15) The closure of a set $X$ is denoted by $X_{cl}$. It can be shown (see [11]) that
$X_{cl} = \bigcap (X + V)$ for $V \in \mathcal{V}$. Observe that, if $F(\sigma)$ is U-integrable to $I(F, \sigma_0)$, then it is also
U-integrable to $I(F, \sigma_0)_{cl}$.

(16) The theorems of this section will be stated for the U-integral and are, of course, true
for the SU-integral. In case the results for the SU-integral are stronger, we state them as
corollaries.



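5.4. COROLLARY. If $F(\sigma)$ is SU-integrable on both $\sigma_1$, $\sigma_2$ and if $\sigma_1 \cap \sigma_2 = 0$,
then $F(\sigma)$ is SU-integrable on $\sigma_1 \cup \sigma_2$ and


$\int_{\sigma_1 \cup \sigma_2} F(d\sigma) = \int_{\sigma_1} F(d\sigma) + \int_{\sigma_2} F(d\sigma).$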


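5.5. THEOREM. The U-integral is a completely additive function of $\sigma$ in the
sense that, if $F(\sigma)$ is U-integrable on each $\sigma^k$ ($k = 0, 1, 2, \ldots$), where
$\sigma^0 = \bigcup_{k=1}^{\infty} \sigma^k$ and $\sigma^m \cap \sigma^n = 0$ ($m \ne n$; $m, n \ne 0$), then $\{\int_{\sigma^k} F(d\sigma)\}$ ($k = 1, 2, \ldots$) is u.s. to
$\int_{\sigma^0} F(d\sigma)$ with respect to every $V \in \mathcal{V}$.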


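There is no loss in taking $\sigma^0 = M$. Since the integral exists for each $\sigma^k$,
there exists $\Delta_V^k$ such that, if $\Delta \ge \Delta_V^k$, $J(F, \sigma^k, \Delta)$ is u.s. to $\int_{\sigma^k} F(d\sigma)$ with respect
to $2^{-k-1}V$ ($k = 0, 1, 2, \ldots$). Denote by $\Delta_V$ the sum of the $\Delta_V^k$ ($k = 1, 2, \ldots$)
over the subdivision $\{\sigma^k\}$ (see §4 above) and set $\Delta_0 = \Delta_V \Delta_V^0$. Evidently on the
set $\sigma^k$ the subdivision $\Delta_0$ is finer than the subdivision $\Delta_V^k$. Therefore $J(F, \sigma^k, \Delta_0)$
is u.s. to $\int_{\sigma^k} F(d\sigma)$ with respect to $2^{-k-1}V$. If $\Delta_0 = \{\sigma_i\}$, then there exists $\pi_0$
such that $\pi \ge \pi_0$ implies the equality of $\sum_\pi F(\sigma_i)$, $\int_M F(d\sigma)$ within $V/2$. Set
$n_V = \max\{n \mid \sigma^n \cap \bigcup_{\pi_0} \sigma_i \ne 0\}$; then, for an arbitrary (but fixed) $n \ge n_V$, there
exists a $\pi_n$ such that $\pi_n \ge \pi_0$, $\sigma^k \cap \sigma_i = 0$ for $i \in \pi_n$ and $k > n$, and


$\sum_{\pi_n} F(\sigma^k \cap \sigma_i) \subset \int_{\sigma^k} F(d\sigma) + 2^{-k-1}V, \quad \int_{\sigma^k} F(d\sigma) \subset \sum_{\pi_n} F(\sigma^k \cap \sigma_i) + 2^{-k-1}V$


for $k = 0, 1, 2, \ldots, n$. It is obvious that(17)


$\sum_{k=1}^{n} \sum_{i \in \pi_n} F(\sigma^k \cap \sigma_i) = \sum_{\pi_n} F(\sigma_i);$

therefore,


$\sum_{\pi_n} F(\sigma_i) \subset \sum_{k=1}^{n} \int_{\sigma^k} F(d\sigma) + V/2, \quad \sum_{k=1}^{n} \int_{\sigma^k} F(d\sigma) \subset \sum_{\pi_n} F(\sigma_i) + V/2.$


Now, since $\pi_n \ge \pi_0$ we have immediately that


$\int_M F(d\sigma) \subset \sum_{k=1}^{n} \int_{\sigma^k} F(d\sigma) + V, \quad \sum_{k=1}^{n} \int_{\sigma^k} F(d\sigma) \subset \int_M F(d\sigma) + V,$


where $n \ge n_V$ is arbitrary. This argument does not depend on the order in
which the sets $\sigma^k$ are taken; therefore the desired result follows from Theorem 2.3.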


(17) It is understood that terms for which $\sigma^k \cap \sigma_i = 0$ are omitted.


5.6. COROLLARY. The SU-integral is completely additive in the ordinary
sense.

The proofs of the next two theorems are not difficult and will be omitted.


5.7. THEOREM. Any function $F(\sigma)$ defined over $\mathfrak{M}$ which is completely additive
in the sense of Theorem 5.5 is U-integrable on every $\sigma$ and $F(\sigma)_{cl} = \int_\sigma F(d\sigma)$.


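5.8. THEOREM. If $F(\sigma)$, $G(\sigma)$ are U-integrable on $\sigma_0$ and if there exists $\Delta_0$ such
that $\{\sigma_i\} = \Delta \ge \Delta_0$ implies the existence of $\pi_\Delta$ such that, for $\pi \ge \pi_\Delta$, it is true that


$\sum_\pi G(\sigma_0 \cap \sigma_i) \subset \sum_\pi F(\sigma_0 \cap \sigma_i) + V,$


then


$\int_{\sigma_0} G(d\sigma) \subset \Big[\int_{\sigma_0} F(d\sigma) + V\Big]_{cl}.$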


5.9. COROLLARY. If $F(\sigma)$, $G(\sigma)$ satisfy the conditions of Theorem 5.8 and
if in addition $F(\sigma)$ is SU-integrable, then $\int_{\sigma_0} G(d\sigma) - \int_{\sigma_0} F(d\sigma) \subset V_{cl}$.


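5.10. THEOREM. If $F(\sigma)$ is U-integrable on $\sigma_0$, then so also is $F(\sigma)_{cl}$ and to
the same value.


For every $V \in \mathcal{V}$, there exists $\Delta_V$ such that, if $\{\sigma_k\} = \Delta \ge \Delta_V$, then there
exists $\pi_\Delta$ for which $\pi \ge \pi_\Delta$ implies the equality of $\int_{\sigma_0} F(d\sigma)$, $\sum_\pi F(\sigma_0 \cap \sigma_k)$ within
$V$. Since $F(\sigma)_{cl} \subset F(\sigma) + V'$ for arbitrary $V' \in \mathcal{V}$ and $\sigma$, it follows that
$\sum_\pi F(\sigma_0 \cap \sigma_k)_{cl} \subset \sum_\pi F(\sigma_0 \cap \sigma_k) + V$. Therefore $\int_{\sigma_0} F(d\sigma)$, $\sum_\pi F(\sigma_0 \cap \sigma_k)_{cl}$ are equal
within $2V$. Since $V$ is arbitrary, the desired result follows by definition.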


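5.11. THEOREM. If $F(\sigma)$, $G(\sigma)$ are U-integrable on $\sigma_0$ and if, for every $\sigma \subset \sigma_0$,
$G(\sigma) \subset F(\sigma)_{cl}$, then $\int_{\sigma_0} G(d\sigma) \subset \int_{\sigma_0} F(d\sigma)$.


In the present case the conditions of Theorem 5.8 hold for every $V \in \mathcal{V}$;
therefore

$\int_{\sigma_0} G(d\sigma) \subset \Big[\int_{\sigma_0} F(d\sigma) + V/2\Big]_{cl} \subset \int_{\sigma_0} F(d\sigma) + V$


for every $V \in \mathcal{V}$. Since $\int_{\sigma_0} F(d\sigma)$ is closed, $\int_{\sigma_0} G(d\sigma) \subset \int_{\sigma_0} F(d\sigma)$.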


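5.12. COROLLARY. If $F(\sigma)$ is SU-integrable on $\sigma_0$ and if, for $\sigma \subset \sigma_0$,
$G(\sigma) \subset F(\sigma)_{cl}$, then $G(\sigma)$ is U-integrable on $\sigma_0$ and $\int_{\sigma_0} G(d\sigma) = \int_{\sigma_0} F(d\sigma)$.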


6. Differential equivalence. The results of this section parallel similar results
obtained by Kolmogoroff for the case of $\mathfrak{X}$ the real numbers. The following
definition of differential equivalence is a direct generalization of the Kolmogoroff
definition [7, p. 666].


6.1. DEFINITION. The functions $F(\sigma)$, $G(\sigma)$ are said to be differentially
equivalent (we write d.e.) on the set $\sigma_0$ provided for every $V \in \mathcal{V}$, there exists a
$\Delta_V$ such that, if $\Delta \ge \Delta_V$, then $J(F, \sigma_0, \Delta)$, $J(G, \sigma_0, \Delta)$ are summably equal within
$V$. (See Definition 2.1 above.)


6.2. THEOREM. If $F(\sigma)$, $G(\sigma)$ are d.e. on $\sigma_0$, then the U-integrability of either
function on $\sigma_0$ implies the U-integrability of the other and to the same value.


Suppose $F(\sigma)$ U-integrable on $\sigma_0$. Evidently, for every $V \in \mathcal{V}$, there exists
$\Delta_V$ such that $\Delta \ge \Delta_V$ implies that $J(F, \sigma_0, \Delta)$ is u.s. to $\int_{\sigma_0} F(d\sigma)$ with respect to
$V$ and that $J(F, \sigma_0, \Delta)$, $J(G, \sigma_0, \Delta)$ are summably equal within $V$. From this
it follows that $J(G, \sigma_0, \Delta)$ is u.s. to $\int_{\sigma_0} F(d\sigma)$ with respect to $2V$ for $\Delta \ge \Delta_V$.
Since $V$ is arbitrary, $G(\sigma)$ is U-integrable to $\int_{\sigma_0} F(d\sigma)$ by definition.


6.3. THEOREM. If $F(\sigma)$, $G(\sigma)$ are U-integrable on $\sigma_0$ to the same value, then
they are d.e. on $\sigma_0$.


6.4. COROLLARY. If $F(\sigma)$ is U-integrable on every $\sigma \subset \sigma_0$, then $F(\sigma)$ and
$\int_\sigma F(d\sigma)$ are d.e. on $\sigma_0$.


In view of the preceding results, one can characterize the (indefinite) 
U-integral in terms of differential equivalence. 


6.5. THEOREM. In order that a function $I(\sigma)$ be the (indefinite) U-integral
of a given function $F(\sigma)$, it is both necessary and sufficient that it be closed (that
is, $I(\sigma) = I(\sigma)_{cl}$), completely additive in the sense of Theorem 5.5 and d.e. to
$F(\sigma)$ on each $\sigma$.


6.6. COROLLARY. In order that a single-valued function $I(\sigma)$ be the (indefinite)
SU-integral of a given function $F(\sigma)$, it is both necessary and sufficient that
it be completely additive in the ordinary sense and d.e. to $F(\sigma)$ on each $\sigma$.


7. Transformation of an integrable function. We now introduce a general
type of linear transformation $T(X)$ defined on subsets of $\mathfrak{X}$ and whose values
are sets in a similar space $\mathfrak{Y}$. The topology on $\mathfrak{Y}$ will be given by the system
of sets $\mathcal{U}$, individual elements of which will be denoted by $U$. $T(X)$ will be
subject to the following three conditions:

(1) $X_1 \subset X_2$ implies $T(X_1) \subset T(X_2)$.

(2) $T(X)$ is linear, that is, $T(\alpha_1 X_1 + \alpha_2 X_2) = \alpha_1 T(X_1) + \alpha_2 T(X_2)$.

(3) $T(X)$ is continuous in the sense that $U \in \mathcal{U}$ implies the existence of a
$V_U \in \mathcal{V}$ such that $T(V_U) \subset U$.

The class of transformations described above contains as a special case
the ordinary linear continuous point transformations $T(x)$ on $\mathfrak{X}$ to $\mathfrak{Y}$, where
$T(X) = \{T(x) \mid x \in X\}$. It also contains the operation of forming the convex
$C_v X$ of a set $X$ and the operation of forming the “generalized convex” $C^*(X)$
of a set using the bounded generalized convex operators of G. B. Price [13,
p. 7]. Observe that in these last two instances $\mathfrak{X} = \mathfrak{Y}$ and the transformations
have the additional property of leaving individual points invariant.




7.1. THEOREM. If $F(\sigma)$ is U-integrable on $\sigma_0$, then the function $T(F(\sigma))$ is
U-integrable on $\sigma_0$ and $\int_{\sigma_0} T(F(d\sigma)) = T[\int_{\sigma_0} F(d\sigma)]_{cl}$.


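Since $T(X)$ is continuous, for arbitrary $U \in \mathcal{U}$ there exists $V_U \in \mathcal{V}$ such
that $T(V_U) \subset U$. Also, since $F(\sigma)$ is U-integrable on $\sigma_0$, there exists $\Delta_U$ such
that, if $\{\sigma_i\} = \Delta \ge \Delta_U$, then there exists $\pi_\Delta$ for which $\pi \ge \pi_\Delta$ implies


$\sum_\pi F(\sigma_0 \cap \sigma_i) \subset \int_{\sigma_0} F(d\sigma) + V_U, \quad \int_{\sigma_0} F(d\sigma) \subset \sum_\pi F(\sigma_0 \cap \sigma_i) + V_U.$


Application of $T$ to these relations and use of (1), (2) give


$\sum_\pi T(F(\sigma_0 \cap \sigma_i)) \subset T\Big[\int_{\sigma_0} F(d\sigma)\Big] + U,$


$T\Big[\int_{\sigma_0} F(d\sigma)\Big] \subset \sum_\pi T(F(\sigma_0 \cap \sigma_i)) + U.$


Therefore $J(T(F), \sigma_0, \Delta)$ is u.s. to $T[\int_{\sigma_0} F(d\sigma)]$ with respect to $U$, which completes
the proof.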


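7.2. COROLLARY. If $F(\sigma)$ is SU-integrable on $\sigma_0$, then(18) $T(F(\sigma))$ is SU-integrable
on $\sigma_0$ and $\int_{\sigma_0} T(F(d\sigma)) = T[\int_{\sigma_0} F(d\sigma)]$.


7.3. COROLLARY. If $\mathfrak{X} = \mathfrak{Y}$ and single elements are invariant under $T$, then
SU-integrability of $F(\sigma)$ implies that of $T(F(\sigma))$ and to the same value; that is,
$\int_{\sigma_0} T(F(d\sigma)) = \int_{\sigma_0} F(d\sigma)$.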


8. The SU-integral in a complete space. It will be recalled that the definition
of the U- and SU-integrals (Definition 4.1) involves an assumption concerning
the existence of the value of the integral in the space. This is, in part,
necessitated by a lack of completeness in the space $\mathfrak{X}$. It is the purpose of this
section to show that the existence assumption can be dropped in the case of
the SU-integral provided the space $\mathfrak{X}$ is complete relative to $\mathcal{V}$ (see §1 above).


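8.1. DEFINITION. The function $F(\sigma)$ is said to be conditionally SU-integrable
on $\sigma_0$ if, for every $V \in \mathcal{V}$, there exists a $\Delta_{V,\sigma_0}$ so that $\{\sigma_i^j\} = \Delta^j \ge \Delta_{V,\sigma_0}$ implies
the existence of independent $\pi(\Delta^j)$ ($j = 1, 2$) for which it is true that
$\sum_{\pi_1} F(\sigma_0 \cap \sigma_i^1) - \sum_{\pi_2} F(\sigma_0 \cap \sigma_i^2) \subset V$ whenever $\pi_j \ge \pi(\Delta^j)$ ($j = 1, 2$).


8.2. THEOREM. If $F(\sigma)$ is SU-integrable on $\sigma_0$, then $F(\sigma)$ is conditionally
SU-integrable on $\sigma_0$.


8.3. THEOREM. If $\mathfrak{X}$ is complete relative to $\mathcal{V}$ and $F(\sigma)$ is conditionally
SU-integrable on $\sigma_0$, then $F(\sigma)$ is SU-integrable on $\sigma_0$.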


(18) Observe that the continuity of $T(X)$ implies that single elements are carried into single
elements; that is, $T(X)$ induces a linear continuous point transformation on $\mathfrak{X}$ to $\mathfrak{Y}$.




Let $\Delta_{V,\sigma_0} = \{\sigma_i^V\}$ be the subdivision and $\pi(\Delta_{V,\sigma_0}) = \pi_V$ the associated set of
positive integers given by Definition 8.1. Denote by $x_V$ a particular one of the
elements in the set $\sum_{\pi_V} F(\sigma_0 \cap \sigma_i^V)$; then, by Definition 8.1, $J(F, \sigma_0, \Delta)$ is u.s.
to $x_V$ with respect to $V$ for every $\Delta \ge \Delta_{V,\sigma_0}$. We now prove that $\{x_V\}$ is a fundamental
$\mathcal{V}$-directed set.

Let $V \in \mathcal{V}$ be arbitrary and consider any pair of elements $V_1, V_2 \in \mathcal{V}$ such
that $V_i \subset V/2$ ($i = 1, 2$). For $\Delta \ge \Delta_{V_1,\sigma_0} \Delta_{V_2,\sigma_0}$ we have that $J(F, \sigma_0, \Delta)$ is u.s. to
$x_{V_1}$ with respect to $V_1$ and to $x_{V_2}$ with respect to $V_2$. It follows directly from
this result that $x_{V_1} - x_{V_2} \in V_1 + V_2 \subset V$ and, hence, that $\{x_V\}$ is a fundamental
$\mathcal{V}$-directed set. Let $x_0$ be the limit of this set. It remains to show that $F(\sigma)$ is
SU-integrable on $\sigma_0$ to the value $x_0$.

For arbitrary $V \in \mathcal{V}$ first choose $V_0 \subset V/2$ such that $V' \subset V_0$ implies
$\pm\{x_{V'} - x_0\} \in V/2$ and then choose $\Delta_{V_0,\sigma_0}$ according to Definition 8.1. Then,
if $\Delta \ge \Delta_{V_0,\sigma_0}$, $J(F, \sigma_0, \Delta)$ is u.s. to $x_{V_0}$ with respect to $V_0$. But $\pm\{x_{V_0} - x_0\} \in V/2$;
therefore $J(F, \sigma_0, \Delta)$ is u.s. to $x_0$ with respect to $V$; that is, $F(\sigma)$ is SU-integrable
on $\sigma_0$ to the value $x_0$.

The proof of the following lemma, though not difficult, is somewhat long; 
so will be omitted. 


8.4. LEMMA. Conditional § U-integrability on $M$ implies conditional
§ U-integrability on every $\sigma$.


Combining Lemma 8.4 with Theorem 8.3 we have 


8.5. THEOREM(19). If $\mathfrak{X}$ is complete relative to $\mathfrak{U}$ and $F(\sigma)$ is conditionally
§ U-integrable on $M$, then $F(\sigma)$ is § U-integrable on every $\sigma$.


9. A convergence theorem for the § U-integral. We consider only the
§ U-integral in this section and restrict attention to integrable functions $F(\sigma)$
for which $\int_\sigma F(d\sigma)$ is absolutely continuous relative to a given, positive,
completely additive measure function $m(\sigma)$. In view of Theorem 3.1, we could
replace the above restriction by the stronger condition that $m(\sigma) = 0$ imply
$F(\sigma) = \theta$.

The following definition gives a generalization of the notion of approximate
convergence(20) to functions $F(\sigma)$ of the type being considered here. It
is also a generalization of a much stronger type of convergence used by
Kolmogoroff [7, p. 665].


9.1. DEFINITION. A sequence of functions $\{F_n(\sigma)\}$ is said to converge
approximately to $F(\sigma)$ relative to $m(\sigma)$ provided, for every integer $n$ and $V \in \mathfrak{U}$,
there exists a $\sigma(n, V) \in \mathfrak{M}$ and a subdivision $\Delta_{nV}$ such that, for each $V$,


(19) Compare with Phillips' Theorem 4.1 [12, p. 122]. Observe that “completeness with
respect to D” used by Phillips implies completeness relative to $\mathfrak{U}$ (they are, in fact, equivalent;
see Footnote 7).


(20) See Definition 12.3 and Theorem 12.4 below.




1942] INTEGRATION IN A TOPOLOGICAL SPACE 509 


$\lim_{n\to\infty} m(\sigma(n, V)) = 0$ and, for $\Delta \ge \Delta_{nV}$, it is true that $J(F_n, \sigma, \Delta)$, $J(F, \sigma, \Delta)$
are summably equal within $V$ for every $\sigma \subset M \cap C\sigma(n, V)$.


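Observe that, if $\Delta = \{\sigma_i\}$ and $J(F_n, \sigma, \Delta)$, $J(F, \sigma, \Delta)$ are summably
equal within $V$ for arbitrary $\sigma \subset M \cap C\sigma(n, V)$, then $J(F_n, \sigma \cap \bigcup_\pi \sigma_i, \Delta)$,
$J(F, \sigma \cap \bigcup_\pi \sigma_i, \Delta)$ are summably equal within $V$ for arbitrary $\pi$ and
$\sigma \subset M \cap C\sigma(n, V)$. It follows immediately that $\sum_\pi F_n(\sigma \cap \sigma_i)$, $\sum_\pi F(\sigma \cap \sigma_i)$ are
equal within $V$ for arbitrary $\pi$ and $\sigma \subset M \cap C\sigma(n, V)$. We thus obtain a
result which is somewhat stronger than summable equality.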

The proof of the next theorem will be omitted, since it is essentially con- 
tained in the first part of the proof of Theorem 9.5 below. 


9.2. THEOREM. Let $F_n(\sigma)$ be § U-integrable on every $\sigma$ and let $\int_\sigma F_n(d\sigma)$ be
absolutely continuous relative to $m(\sigma)$ $(n = 0, 1, 2, \ldots)$. Then, if $F_n(\sigma)$ converges
approximately to $F_0(\sigma)$ relative to $m(\sigma)$, the following are equivalent:

(i) $\lim_{n\to\infty} \int_\sigma F_n(d\sigma) = \int_\sigma F_0(d\sigma)$ uniformly in $\sigma$.

(ii) $\int_\sigma F_n(d\sigma)$ are equi-absolutely continuous relative to $m(\sigma)$.

9.3. DEFINITION. The function $F(\sigma)$ is said to be § U-integrable uniformly
in $\sigma$ provided $F(\sigma)$ is § U-integrable on every $\sigma$ and, for each $V \in \mathfrak{U}$, there exists
$\Delta_V$ independent of $\sigma$ such that, if $\Delta \ge \Delta_V$, then $J(F, \sigma, \Delta)$ is u.s. to $\int_\sigma F(d\sigma)$ with
respect to $V$ uniformly in $\sigma$.


9.4. Lemma. Let F(c) be § U-integrable uniformly in o and let oo be such 
that + fene,F(do) € V for all o. Then there exists Ay such that {o;} =A>Ay im- 


plies +> C2V for arbitrary x’. 


From the definition of uniform § U-integrability, there exists Ay such that 
{o;} =A 2Ay implies that J(F, ¢, A) is u.s. to {,F(do) with respect to V uni- 
formly in o, that is, there exists 74 independent of o such that, if 274, 


Now let 2’ be arbitrary and set \(U,-0;), in (1). Since 
+ fane,F(do) E V for all o, this completes the proof. 


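9.5. THEOREM. Let $\mathfrak{X}$ be sequentially complete, $F_n(\sigma)$ § U-integrable uniformly
in $\sigma$, and $\int_\sigma F_n(d\sigma)$ absolutely continuous relative to $m(\sigma)$ $(n = 1, 2, \ldots)$.
Then, if $\{F_n(\sigma)\}$ converges approximately to $F(\sigma)$ relative to $m(\sigma)$, the following
are equivalent:

(i) $F(\sigma)$ is § U-integrable uniformly in $\sigma$ and $\lim_{n\to\infty} \int_\sigma F_n(d\sigma) = \int_\sigma F(d\sigma)$
uniformly in $\sigma$.

(ii) $\lim_{n\to\infty} \int_\sigma F_n(d\sigma)$ exists for every $\sigma$.

(iii) $\int_\sigma F_n(d\sigma)$ are equi-absolutely continuous relative to $m(\sigma)$.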


That (i) implies (ii) is trivial and (ii) implies (iii) by Theorem 3.2. We 
prove that (iii) implies (i). 




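Let $V \in \mathfrak{U}$ be arbitrary and set $\sigma_{mn} = \sigma(m, V) \cup \sigma(n, V)$, where $\sigma(m, V)$,
$\sigma(n, V)$ are given by Definition 9.1. If $\Delta \ge \Delta_{mV} \cdot \Delta_{nV}$, where $\Delta_{mV}$, $\Delta_{nV}$ are given
by Definition 9.1, and if $\sigma \subset M \cap C\sigma_{mn}$, then $J(F_m, \sigma, \Delta)$, $J(F_n, \sigma, \Delta)$ are each
summably equal to $J(F, \sigma, \Delta)$ within $V$. It follows that $J(F_m, \sigma, \Delta)$, $J(F_n, \sigma, \Delta)$
are summably equal within $2V$. Now an application of Corollary 5.9 gives

$$\int_{\sigma \cap C\sigma_{mn}} F_m(d\sigma) - \int_{\sigma \cap C\sigma_{mn}} F_n(d\sigma) \in (2V)_{\mathrm{cl}} \subset 3V.$$

This holds for arbitrary $\sigma$ and all $m$, $n$. Using (iii), we obtain $n_V$ such that
$m, n \ge n_V$ implies $\pm\int_{\sigma \cap \sigma_{mn}} F_k(d\sigma) \in V$ for arbitrary $\sigma$ and all $k$. Therefore, if
$m, n \ge n_V$,

$$\int_\sigma F_m(d\sigma) - \int_\sigma F_n(d\sigma) \in 3V + 2V \subset 5V,$$

where $\sigma$ is arbitrary and $n_V$ obviously does not depend on $\sigma$. It follows that
$\{\int_\sigma F_n(d\sigma)\}$ is a fundamental sequence uniformly in $\sigma$. Since $\mathfrak{X}$ is sequentially
complete, there exists $I(\sigma) \in \mathfrak{X}$ such that $\lim_{n\to\infty} \int_\sigma F_n(d\sigma) = I(\sigma)$ uniformly
in $\sigma$. It remains to show that $F(\sigma)$ is § U-integrable uniformly in $\sigma$ to the
value $I(\sigma)$.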

Let VEU be arbitrary and select a subsequence of the F,(¢), which we 
continue to denote by { F,(¢)}, having the following two properties: 

(a) +{f.Fi(do) —I(c) } €2-V for all o. 

(b) There exist 7,€M such that(?") m(r7,) <6(2-"-*V), ta DTa41, and also 
there exist A,y such that A2A,y implies J(F,, 0, A), J(F, ¢, A) are summably 
equal within +2-*-*V for all 

Set 0? = and 08 =7,_:/\Cr, for n= 2, and consider the subdivision 
A®= {a2}. Since m(ao8) < 5(2-"-* V) (for 22), it follows that + 
€2-*-*V for arbitrary ¢. Hence, by Lemma 9.4, there exists A*2>A°A,y such 
that {o;} =A2A* implies 


+ ono) CV/2™, 
for arbitrary 7’, 0 and 22. Out of property (b) and the remark following 


Definition 9.1, it follows that are equal 
within + 2-*-?V and, hence, that 


+ CV/2"". 


This result holds for arbitrary n22, 0, and {o;} =A2A*. 
Now define A! such that A2A! implies J(F, ¢, A) u.s. to {,Fi(do) with re- 
spect to 2-*V uniformly in ¢, and let Ay be the sum of the subdivisions A* 


(21) $\delta(\epsilon V) > 0$ is chosen so that, if $m(\sigma) < \delta(\epsilon V)$, then $\pm\int_\sigma F_k(d\sigma) \in \epsilon V$ for all $k$.




(n=1, 2, ---) over the subdivision A® (see §4 above). For {o;} =A 2Ay and 
arbitrary 7’, 7, we have 


(2) = +d V/2"" Cv/2, 
x’ n=2 n=s2 

where N,, is the largest for which 0. 

If ADAy, then J(F;, A) is us. to with respect to 2-*V 
uniformly in ¢. Moreover, since m(r1) <6(2-*V), it follows that +/, enr,Fi(do) 
€2-5V and, hence, that J( Fi, oo}, A) is u.s. to {,Fi(do) with respect to 2-*V 
uniformly in ¢. Applying (a), we obtain J(Fi, a(\o?, A) u.s. to I(¢) with re- 
spect to V/2 and, applying (b) again, we have J(F, o(\o, A) u.s. to I(c) with 
respect to V/2 uniformly in o. From this last result and (2) it follows that 
$J(F, \sigma, \Delta)$ is u.s. to $I(\sigma)$ with respect to $V$ uniformly in $\sigma$, which completes
the proof of Theorem 9.5.


Part III. INTEGRATION WITH RESPECT TO A “BILINEAR” FUNCTION(22)


10. The $U_B$-integral. Let $\mathfrak{X}$, as usual, be a convex linear topological space
and let $\mathfrak{Y}$ be simply a linear space. We introduce a “bilinear” function $B[y, \sigma]$
subject to the following four conditions:

B1. For every $y \in \mathfrak{Y}$ and(23) $\sigma \in \mathfrak{M}$, $B[y, \sigma]$ is a unique element of $\mathfrak{X}$.

B2. $B[y, \sigma]$ is linear (not necessarily continuous) in $y$ for each $\sigma$; that is,


$$B[a_1 y_1 + a_2 y_2, \sigma] = a_1 B[y_1, \sigma] + a_2 B[y_2, \sigma].$$


B3. For each $y$, $B[y, \sigma]$ is a completely additive function of $\sigma$.
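B4. There exists a real number $\beta \ge 1$ such that(24)


$$\sum_{i=1}^{m} B[Y_i, \sigma_i] \subset V \quad \text{implies} \quad \sum_{i=1}^{m} \sum_{j=1}^{n_i} B[Y_i, \sigma_i^j] \subset \beta V,$$

where $Y_i \subset \mathfrak{Y}$, $\sigma_i \cap \sigma_j = 0$ $(i \ne j)$, $\sigma_i^j \subset \sigma_i$, $\sigma_i^j \cap \sigma_i^k = 0$ $(j \ne k)$.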

The functions $y(\sigma)$ to be considered in this part will be multi-valued and
defined on $\mathfrak{M}$ to $\mathfrak{Y}$. They will be subject without exception to the restriction
that $\sigma_1 \subset \sigma_2$ shall imply $y(\sigma_1) \subset y(\sigma_2)$. Such functions are described as
contractive(25). If $\Delta = \{\sigma_i\}$ is an arbitrary subdivision, the sequence of sets
$\{B[y(\sigma \cap \sigma_i), \sigma \cap \sigma_i]\}$ will be denoted by the symbol $J_B(y, \sigma, \Delta)$.


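10.1. DEFINITION. $y(\sigma)$ is said to be $U_B$-integrable on $\sigma_0$ provided there
exists an element $I_B(y, \sigma_0) \in \mathfrak{X}$ such that, for every $V \in \mathfrak{U}$, there exists a $\Delta_{\sigma_0 V}$ for
which $J_B(y, \sigma_0, \Delta_{\sigma_0 V})$ is u.s. to $I_B(y, \sigma_0)$ with respect to $V$. $I_B(y, \sigma_0)$ is the value
of the integral and we write $I_B(y, \sigma_0) = \int_{\sigma_0} B[y, d\sigma]$.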


(22) The general idea of considering integration with respect to a “bilinear” function was
suggested by T. H. Hildebrandt. This paper represents a development from that idea.

(23) See Footnote 14 above.

(24) If $Y \subset \mathfrak{Y}$, then $B[Y, \sigma] = \{B[y, \sigma] \mid y \in Y\}$.

(25) Observe that, if $y(s)$ is a point function, the associated set function $y(\sigma) = \{y(s) \mid s \in \sigma\}$
is contractive.




Observe that this definition is weaker in form than Definition 4.1, because
the assumption here is that $J_B(y, \sigma_0, \Delta)$ be u.s. to $I_B(y, \sigma_0)$ only for
$\Delta = \Delta_{\sigma_0 V}$ rather than $\Delta \ge \Delta_{\sigma_0 V}$. The weakening of the definition of integrability
is balanced by the conditions on $B[y, \sigma]$. Definition 10.1 is essentially that
used by R. S. Phillips [12, p. 118] and the integral obtained here will be seen
to reduce to his as a special case (Theorem 15.3).


10.2. Lemma. Let Y;CY, oi\o;=0 (647), 0: 0}, =0 
then 


+ {4 > CF 
i=] 
implies the existence of a mo independent of x such that, if r;=7o, 


+ + > BIY,, C 


jE 


We have immediately that 


> B[Y; — C 2V; 
t=1 


hence, by condition B4, 


> — + aly. C 26v, 


im1 


where the 7; are completely arbitrary. This can be written in the form 
i=1 iE i=1 
(1) : 
i=1 


This gives 
t=1 t=1 
It follows by the hypothesis of the lemma that 
@ + {24d alr. ot] + u cor 


Now let y; be some particular element of Y;; then from condition B3 it is 
evident that there exists a 1» such that 7;270 implies 


(3) + Daly u ob ev. 


t—1 




Relations (2) and (3) together give 
jor; 
which completes the proof. 


10.3. Lemma. If Ja(y, o, Ao) is u.s. to Ip(y, 0) with respect to V, then 
Ja(y, a, A) is u.s. to Ip(y, with respect to 7BV for all 


There is no loss in taking ¢= M. Let Ao= {o;} ; then by hypothesis there 
exists 79 such that 7 27» gives 


(1) + { Bly(od, Cv. 


Consider =A2Ao, where o;=U,2,0}. By Lemma 10.2 there exists 
such that 7;=7¢ implies 


(2) Bly(o1), 01] — Ia(y, uy} C 


Now let vo denote those integer pairs (i, 7) for which iG a» and jEm/, and 
consider any finite set of integer pairs y which contains vo; that is, y= vo. Set 
v’ = { (4, j)| G, Ev, and v’’ = { (i, 7) Ev, It follows from 
(2) that 


Moreover, from (1) we have 

(4) + Bly(oi), C 2V, 

for arbitrary 7’ such that 2’(\r9=0. Because of the arbitrary character of 7’ 
in (4), we can write 

(5) + Bly(oi) U8, C 2V. 


Now let = {i| (4, 7) for some j}, = {j| @, 7) €v’’} and apply B4 to 
(5) to obtain 


+ { Vs, oi] + a| Ue, U Al C 


From this it follows that 


(6) +2 B[y(ox), C 28V. 




Combining (3) and (6) we obtain 


As a consequence of Lemma 10.3 we have 


10.4. THEOREM. $U_B$-integrability of $y(\sigma)$ is equivalent to § U-integrability of
$F(\sigma) = B[y(\sigma), \sigma]$.


Investigation of the form in which the basic properties of the § U-integral
appear in the special case of the $U_B$-integral will be left to the reader.

The following lemma is in preparation for the proof that $U_B$-integrability
of $y(\sigma)$ for every $\sigma$ implies § U-integrability of $F(\sigma)$ uniformly in $\sigma$.


10.5. Lemma. If for a given A= {ox} there exists my such that m;27, 
(i=1, 2) implies 


(1) > B[y(o;), oi] > B[y(o;), V, 


then for arbitrary o and 7:27, (i=1, 2) 


Taking 71=72=7,4, we obtain from (1) 


Dd Bly(oi) — 01] CV. 


An application of B4 gives 
X {Bly(o) — yo), — Bly(os) — Col} C BV. 


Since y(o;) —y(o;), it follows that 
(2) Bly N — Bly 0), C BV. 


Again from (1) we have for arbitrary 7’(\1, =0 
+ Bly(os), CV, 
which, because of the arbitrary character of 7’, implies (as in the proof ‘of 
Lemma 10.3) 
+ Bly(o C BV. 
Combining this last result with (2) completes the proof of the lemma. 


10.6. THEOREM. $U_B$-integrability of $y(\sigma)$ on every $\sigma$ is equivalent to
§ U-integrability of $F(\sigma) = B[y(\sigma), \sigma]$ uniformly in $\sigma$.




Because of Theorem 10.4, for every VE there exists Ay such that, if 
{o;} =A 2Ay, then there exists 7, for which 7 274 implies 


{ >> B[y(o), oi] -f Bly, V/2. 
M 
Therefore, if 7; 27,4 (¢=1, 2), 
Bly(o.), — Bly(os), CV. 
From Lemma 10.5 it follows that 


(1) + { Bly(o No), — No), oA C 


for arbitrary o and 7;27, (4=1, 2). Now, for a particular o choose A,y such 
that A’2A,y implies Ja(y, 7, A’) u.s. to [.Bly, do] with respect to V. Let 
} =A’ =AA,y, then it follows that there exists such that, if r’ 274, 


(2) + of), f Bly CV. 


Now in (1) choose 7227, such that U,,40/ CU,,0;. Then, since A’2A, we 
can apply Lemma 10.2 to (1) and obtain the existence of a rf 274 such that 


+ — > Bly(o Nol), it C 166°V. 


This result with (2) yields 
+ { Bly(o f 2b. ae] C 186*V, 


for arbitrary 274. Since 74 does not depend on g, the proof is complete. 

11. The $U_B$-integral in a complete space. It has already been observed
that the definition of the $U_B$-integral is weaker in form than the definition
of the § U-integral. Similarly, conditional $U_B$-integrability can be defined in a
weaker form than conditional § U-integrability.


11.1. DEFINITION. y(c) is said to be conditionally Us-integrable on oo pro- 
vided for every VEU there exists and such that, if Ti ov 
(i=1, 2), then 


LD 01), 060 01] — Bl 04), CV. 


The following theorem follows easily from Lemma 10.2. 


11.2. THEOREM. Conditional $U_B$-integrability of $y(\sigma)$ is equivalent to
conditional § U-integrability of $F(\sigma) = B[y(\sigma), \sigma]$.




Out of Theorems 11.2 and 8.5 we obtain 


11.3. THEOREM. If $\mathfrak{X}$ is complete relative to $\mathfrak{U}$ and $y(\sigma)$ is conditionally
$U_B$-integrable on $M$, then $y(\sigma)$ is $U_B$-integrable on every $\sigma$.


12. Absolute continuity and a convergence theorem for the $U_B$-integral.
In this section we assume given a positive completely additive measure function
$m(\sigma)$ defined over $\mathfrak{M}$ such that $m(\sigma) = 0$ implies $B[y, \sigma] = \theta$ for every
$y \in \mathfrak{Y}$. We have immediately that $m(\sigma) = 0$ implies $\int_\sigma B[y, d\sigma] = \theta$, where $y(\sigma)$
is arbitrary. This remark plus Theorem 10.4, Corollary 5.6 and Theorem 3.1
enables us to state


12.1. THEOREM. If $y(\sigma)$ is $U_B$-integrable on every $\sigma$, then $\int_\sigma B[y, d\sigma]$ is
completely additive and absolutely continuous relative to $m(\sigma)$.


Throughout this and the following section we assume a topology(26) on
the space $\mathfrak{Y}$ given by a system of neighborhoods, individual elements of
which will be denoted by $U$. Also $B[y, \sigma]$ will be subject to the following
condition in addition to B1-B4.

B5. $B[y, M]$ is continuous on $\mathfrak{Y}$ to $\mathfrak{X}$; that is, for every $V \in \mathfrak{U}$ there exists
a neighborhood $U_V$ in $\mathfrak{Y}$ such that $B[U_V, M] \subset V$.


12.2. THEOREM. If $B[y, \sigma]$ satisfies B1-B5, then $B[y, \sigma]$ is continuous for
each $\sigma$ and uniformly in $\sigma$.


Let $V \in \mathfrak{U}$ and choose $U_V$ such that $B[U_V, M] \subset V/\beta$. Applying B4 we get
$B[U_V, \sigma] + B[U_V, M \cap C\sigma] \subset V$, where $\sigma$ is arbitrary. Since $\theta \in U_V$, we have
$B[U_V, \sigma] \subset V$. But $U_V$ does not depend on $\sigma$; hence $B[y, \sigma]$ is continuous
uniformly in $\sigma$.


12.3. DEFINITION. The sequence of functions $\{y_n(\sigma)\}$ is said to converge
approximately(27) to $y(\sigma)$ relative to $m(\sigma)$ provided, for every $n$, $U$, there exists
$\sigma(n, U) \in \mathfrak{M}$ such that $\lim_{n\to\infty} m(\sigma(n, U)) = 0$ for each $U$ and for $\sigma \subset M \cap C\sigma(n, U)$
it is true that the sets $y_n(\sigma)$, $y(\sigma)$ are equal within $U$.


12.4. THEOREM. If $y_n(\sigma)$ converges approximately to $y(\sigma)$ according to
Definition 12.3, then $F_n(\sigma) = B[y_n(\sigma), \sigma]$ converges approximately to $F(\sigma) = B[y(\sigma), \sigma]$
according to Definition 9.1.


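Given $V \in \mathfrak{U}$, because of Theorem 12.2 we can choose $U_V$ such that
$B[U_V, \sigma] \subset V/\beta$ for every $\sigma$. It follows immediately from B4 that $\sum_i B[U_V, \sigma_i] \subset V$
for arbitrary disjoint $\sigma_i$. Now, if $\sigma_i \subset M \cap C\sigma(n, U_V)$, then


$$y_n(\sigma_i) \subset y(\sigma_i) + U_V.$$


Applying $B[y, \sigma_i]$ to this relation and adding, we obtain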


(26) $\mathfrak{Y}$ is not necessarily assumed to be convex.
(27) This definition is a generalization of the one used by Phillips [12, p. 125].




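$$\sum_{i=1}^{m} B[y_n(\sigma_i), \sigma_i] \subset \sum_{i=1}^{m} B[y(\sigma_i), \sigma_i] + V.$$


In a similar manner we obtain

$$\sum_{i=1}^{m} B[y(\sigma_i), \sigma_i] \subset \sum_{i=1}^{m} B[y_n(\sigma_i), \sigma_i] + V.$$


It follows from these relations that $J_B(y_n, \sigma, \Delta)$, $J_B(y, \sigma, \Delta)$ are summably
equal within $V$ for every $\Delta$ and $\sigma \subset M \cap C\sigma(n, U_V)$. Since $\lim_{n\to\infty} m(\sigma(n, U_V)) = 0$,
the proof is complete.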

Collecting the results of Theorems 10.6, 12.4, 9.5, we can state 


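12.5. THEOREM. Let $\mathfrak{X}$ be sequentially complete, $y_n(\sigma)$ be $U_B$-integrable on
each $\sigma$ $(n = 1, 2, \ldots)$, and $y_n(\sigma)$ converge approximately to $y(\sigma)$. Then the
following are equivalent:

(i) $y(\sigma)$ is $U_B$-integrable on each $\sigma$ and $\lim_{n\to\infty} \int_\sigma B[y_n, d\sigma] = \int_\sigma B[y, d\sigma]$
uniformly in $\sigma$.

(ii) $\lim_{n\to\infty} \int_\sigma B[y_n, d\sigma]$ exists for each $\sigma$.

(iii) $\int_\sigma B[y_n, d\sigma]$ are equi-absolutely continuous relative to $m(\sigma)$.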


13. Measurable functions. An existence theorem for the $U_B$-integral. The
following definition of measurability for functions $y(\sigma)$ of the type being
considered here is a generalization of a definition given by Price [13, p. 25] for
single-valued point functions with values in a Banach space. We are interested
here only in generalizing Theorem (16.1) of [13] to the $U_B$-integral,
where $B[y, \sigma]$ is subject to all of the five conditions B1-B5.


13.1. DEFINITION. The function y(c) is said to be measurable (M) on the set 
Oo provided, for every set Y dense in y(ao), yE Y and UE'U implies the existence 
of EM such that y(oy) Cy+ U and such that o9=Uyerow. 


The next definition gives a generalization of a familiar condition fre- 
quently imposed on Lebesgue measurable functions to insure the existence of 
a finite Lebesgue integral. 


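13.2. DEFINITION. The function $y(\sigma)$ is said to be B-summable provided, for
every $V \in \mathfrak{U}$, there exists $\Delta_V$ such that, if $\{\sigma_i\} = \Delta \ge \Delta_V$, then there exists $\pi_\Delta$ for
which $\pi' \cap \pi_\Delta = 0$ implies $\pm\sum_{\pi'} B[y(\sigma_i), \sigma_i] \subset V$.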


Observe that, if $y(\sigma)$ is B-summable, then $J_B(y, M, \Delta)$ is u.s. to
$\sum_{\pi_\Delta} B[y(\sigma_i), \sigma_i]$ with respect to $V$ for $\Delta \ge \Delta_V$. Also it can be proved that
$y(\sigma)$ is B-summable if there exists $\Delta_V$ with the property that, for each $\Delta \ge \Delta_V$,
there is a bounded(28) set $I_\Delta$ such that $J_B(y, M, \Delta)$ is u.s. to $I_\Delta$ with respect
to $V$.


(28) A set $X \subset \mathfrak{X}$ is said to be bounded provided, for every $V \in \mathfrak{U}$, there exists $a > 0$ such that
$X \subset aV$ [11].




A function $y(\sigma)$ is said to be separable provided the set $y(M)$ is separable.
Almost separable (B) will mean that there exists a set $\sigma_0$ of B-measure zero
(that is(29), $B[y, \sigma_0] = \theta$ for every $y \in \mathfrak{Y}$) such that $y(M \cap C\sigma_0)$ is separable
[13, p. 25].


13.3. THEOREM. Let $\mathfrak{X}$ be complete relative to $\mathfrak{U}$ and let $y(\sigma)$ be B-summable,
almost separable (B) and measurable ($\mathfrak{M}$) on the set $M \cap C\sigma_0$, where $\sigma_0$ is the
set of B-measure zero such that $y(M \cap C\sigma_0)$ is separable. Then $y(\sigma)$ is
$U_B$-integrable on each $\sigma$.


It may as well be assumed at the outset that $y(\sigma)$ is separable. Also, in
view of Theorem 11.3, it will be sufficient to prove $y(\sigma)$ conditionally
$U_B$-integrable on $M$.

Let $V \in \mathfrak{U}$ be arbitrary and choose a neighborhood $U_V$ so that $B[U_V, \sigma] \subset V/\beta$ for all
$\sigma$. Then, for arbitrary disjoint $\sigma_i$, it follows by condition B4 that


(1) $\sum_i B[U_V, \sigma_i] \subset V.$


Since y(c) is measurable there exists such that y(o2)Cyn+ UV’ 
and M=U,_,02, where {yn} is the separating sequence for y(M) and U’E U 
is chosen so that +2U’CU. Let Ay be the subdivision given by Definition 
13.2 and choose {a;} =Ay 2Ay such that each o; is contained in one of the 
sets 0°. Observe that y(o;) —y(o;:) C Uy for every 7. 

Since {o;} 2Ay, there exists ry such that x’//\y =0 implies +>>,-B[y(o,), 
o:]CV. For arbitrary (i=1, 2) set where =0. Then 


Bly(o:), — Bly(os), o:] 
= — y(o:), + B[y(o,), o:] — B[y(o:), 
+V+VCV4+V4+V C4, 


Ty 


where the next to last inclusion follows by (1). Since V is arbitrary, this com- 
pletes the proof. 

If condition B3 is strengthened so that $B[y, \sigma]$ is completely additive
uniformly for $y \in U$, where $U$ is an arbitrary neighborhood in $\mathfrak{Y}$ (we say that
$B[y, \sigma]$ is uniformly completely additive), then we can prove that
boundedness(30) of $y(\sigma)$ implies B-summability and thus obtain the following theorem.


13.4. THEOREM. Let $\mathfrak{X}$ be complete relative to $\mathfrak{U}$ and let $B[y, \sigma]$ be uniformly
completely additive. Then, if $y(\sigma)$ is bounded, almost separable (B) and measurable
($\mathfrak{M}$) on the set $M \cap C\sigma_0$, it follows that $y(\sigma)$ is $U_B$-integrable on each $\sigma$.


(29) It is easy to prove, using B1-B4, that $B[y, \sigma_0] = \theta$ for every $y \in \mathfrak{Y}$ implies $B[y, \sigma] = \theta$
for every $\sigma \subset \sigma_0$ and $y \in \mathfrak{Y}$.

(30) The function $y(\sigma)$ is said to be bounded provided the set $y(M)$ is bounded.


Part IV. RELATION TO OTHER INTEGRALS

14. The Kolmogoroff integral. The following theorem is a direct consequence
of definitions; therefore the proof will be omitted.

14.1. THEOREM. If $\mathfrak{X}$ is taken to be the real numbers, then the § U-integral
includes(31) the Kolmogoroff single-valued integral [7, p. 663].


15. The Phillips integral. Consider the special “bilinear” functions $m(\sigma)x$,
where $m(\sigma)$ is a completely additive positive measure function over $\mathfrak{M}$ and $x$
is an element of the convex linear topological space $\mathfrak{X}$.


15.1. Lemma. Let X, YCR, o = os \0;=0 (ij). Then Y+m(o)X 
CV.1, where VEU, implies XC Ver. 


We have m(o,) Y+m(c)m(o;)X Cm(o;) Ver (¢=1, - - - , m). Summing these 
relations over 7 gives 


> + m(c) > m(o;)V 


But V.. is convex; therefore Ve. Moreover m(c)Y 
Cy hence provided m(c) 0. Since the 
lemma is obviously true if m(¢) =0, this completes the proof. 


15.2. THEOREM. The “bilinear” function $B[x, \sigma] = m(\sigma)x$ satisfies all of the
conditions B1-B5 (including uniform complete additivity); therefore the entire
theory of the $U_B$-integral applies.


All of the conditions are obviously satisfied except B4, which follows
directly from Lemma 15.1. Observe that in the present case the constant of B4
can be taken as $1 + \epsilon$, where $\epsilon > 0$ is arbitrary.
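
The conditions called obvious above can be written out briefly. The following is only a sketch, not taken from the paper: it assumes that $\mathfrak{Y} = \mathfrak{X}$ with the same system of neighborhoods, that scalar multiplication in $\mathfrak{X}$ is continuous, and that $0 < m(M) < \infty$. The symbols $a_1, a_2$, $\sigma_i$ and $U_V$ are notation chosen here for illustration.

```latex
% B2: linearity in x for fixed \sigma, with scalars a_1, a_2.
\[
  m(\sigma)\,(a_1 x_1 + a_2 x_2) = a_1\, m(\sigma)\, x_1 + a_2\, m(\sigma)\, x_2 .
\]
% B3: complete additivity in \sigma for fixed x and disjoint \sigma_i.
% The scalar series \sum_i m(\sigma_i) has non-negative terms, so it converges
% in any order, and continuity of scalar multiplication carries this over to x.
\[
  m\Bigl(\bigcup_i \sigma_i\Bigr)\, x = \Bigl(\sum_i m(\sigma_i)\Bigr) x = \sum_i m(\sigma_i)\, x .
\]
% B5: continuity of B[x, M] in x (assumes 0 < m(M) < \infty).
\[
  U_V = \frac{1}{m(M)}\, V
  \quad\Longrightarrow\quad
  B[U_V, M] = m(M)\, U_V = V \subset V .
\]
```

Condition B4 is the only one that uses the convexity of the neighborhoods, which is why the text isolates it in Lemma 15.1.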

Definition 10.1 of the $U_B$-integral reduces in this case to precisely the
definition used by Phillips [12, p. 118]; therefore


15.3. THEOREM. For the case $B[x, \sigma] = m(\sigma)x$, the $U_B$-integral reduces to the
Phillips integral.


16. The Price integral. Consider the special “bilinear” function $\tau(\sigma)x$,
where $x$ is an element of a Banach space $\mathfrak{X}$ with its norm topology of spheres
having center $\theta$ and where $\tau(\sigma)$ is a linear continuous transformation of $\mathfrak{X}$
into itself. $\tau(\sigma)x$ means the result of transforming $x$ by $\tau(\sigma)$. G. B. Price has


(31) One integral notion is said to include a second provided every function integrable
according to the second notion is also integrable according to the first and to the same value.




defined an integral for this situation by first subjecting $\tau(\sigma)$ to the following
conditions(32) [13, properties (8.1)-(8.3)].

T1. If $\tau(\sigma)$ is the identically zero transformation, then $\sigma' \subset \sigma$ implies that
$\tau(\sigma')$ is also identically zero.

T2. If $\tau(\sigma)$ is not the identically zero transformation, then it has a continuous
inverse.

T3. For every sequence $\{\sigma_n\}$ of disjoint elements of $\mathfrak{M}$, $\tau(\bigcup \sigma_n) = \sum \tau(\sigma_n)$,
where the series is unconditionally convergent according to the norm topology in
the space of transformations.

T4. The generalized convex operator $C^*$ generated by $\tau(\sigma)$ is bounded [13,
pp. 7-10]. The bound will be denoted by $\beta'$.


16.1. THEOREM. If $\tau(\sigma)$ satisfies T1-T4, then the “bilinear” function
$B[x, \sigma] = \tau(\sigma)x$ satisfies conditions B1-B5 (including uniform complete
additivity). Therefore the entire theory of the $U_B$-integral applies.


All of the conditions are obviously satisfied except B4 for which we make 
the following proof. 

Let X;C%, \o;=0 (ij), of =0 (jk), and assume 
>" r(o,)X:CV;, where V, is a sphere of radius r and center 0€X. The 
thing to be proved is that(#) 5°71) %,7(of)X:CB’(V;)e. In view of T1, we 
can evidently assume 7(¢;) not identically zero (i=1, - - - , m). For simplicity 
let uj then 


m 


jul 


Moreover, since )_74,uj=I, where I is the identity transformation, we have 


‘the following 


im=1 


But {uit - - pm} is obviously an element of the generalized convex operator 

The following theorem is an easy consequence of Theorems (11.4) and
(4.11) of Price's paper.


(32) Condition T1 was not stated explicitly by Price but is implicit in the proof of part 6.9
of his Theorem 6.4. Also the situation discussed here is a bit more restricted than that considered
by Price, since we require $\tau(\sigma)$ to be defined for every $\sigma \in \mathfrak{M}$ while Price admitted certain sets
with “infinite measure” (see Footnote 14 above).

(33) Observe that the constant $\beta$ of condition B4 can then be taken as $\beta' + \epsilon$, where $\epsilon > 0$ is
arbitrary.




16.2. THEOREM. If $B[x, \sigma] = \tau(\sigma)x$, then the $U_B$-integral includes the Price
integral.


17. Open questions. Is it possible to dispense throughout with the condition
that the space $\mathfrak{X}$ be convex?

Kolmogoroff [7] has also given a definition of a multi-valued integral for
real functions. What is the precise relationship of this Kolmogoroff integral
to the § U-integral when $\mathfrak{X}$ is the real numbers?

Is the specialization of the $U_B$-integral which includes the Price integral
actually equivalent to it?

Is condition T4 on $\tau(\sigma)$ equivalent, in the presence of T1-T3, to condition
B4 on $B[x, \sigma] = \tau(\sigma)x$?


BIBLIOGRAPHY 


1. S. Banach, Théorie des Opérations Linéaires, Monografje Matematyczne, Warsaw, 1934.

2. Garrett Birkhoff, Moore-Smith convergence in general topology, Annals of Mathematics, 
(2), vol. 38 (1937), pp. 39-56. 

3. Nelson Dunford, Uniformity in linear spaces, these Transactions, vol. 44 (1938), pp. 
305-356. 

4. L. M. Graves, On the completing of a Hausdorff space, Annals of Mathematics, (2), vol. 38
(1937), pp. 61-64.

5. F. Hausdorff, Mengenlehre, Berlin, 1927.

6. T. H. Hildebrandt, On unconditional convergence in normed vector spaces, Bulletin of the
American Mathematical Society, vol. 46 (1940), pp. 957-962.

7. A. Kolmogoroff, Untersuchungen über den Integralbegriff, Mathematische Annalen, vol.
103 (1930), pp. 654-696.

8. ———, Zur Normierbarkeit eines allgemeinen topologischen linearen Raumes, Studia
Mathematica, vol. 5 (1934), pp. 29-33.

9. K. Kunisawa, Some theorems on abstractly-valued functions in an abstract space, Proceed- 
ings of the Imperial Academy, Tokyo, vol. 16 (1940), pp. 68-72. 

10. E. H. Moore and H. L. Smith, A general theory of limits, American Journal of Mathe- 
matics, vol. 44 (1922), pp. 102-121. 

11. J. von Neumann, On complete topological spaces, these Transactions, vol. 37 (1935), 
pp. 1-20. 

12. R. S. Phillips, Integration in a convex linear topological space, these Transactions, vol. 47
(1940), pp. 114-145. 

13. G. B. Price, The theory of integration, these Transactions, vol. 47 (1940), pp. 1-50. 

14. S. Saks, Addition to the note on some functionals, these Transactions, vol. 35 (1933), 
p. 967. 

15. ———, Theory of the Integral, Monografje Matematyczne, Warsaw, 1937. 

16. J. V. Wehausen, Transformations in linear topological spaces, Duke Mathematical Jour- 
nal, vol. 4 (1938), pp. 157-169. 


UNIVERSITY OF MICHIGAN 
ANN ARBOR, MICH.


THE RESTRICTED PROBLEM OF THREE BODIES 


BY 
MONROE H. MARTIN 


The difficulty of the problem of three bodies led Jacobi(1) to introduce a
simplifying assumption, designed to make the problem more amenable to 
mathematical attack, but such that the problem retains its astronomical sig- 
nificance. In the restricted problem of three bodies Jacobi postulated that 
two masses known as finite masses revolve perpetually in concentric circles 
about their common center of mass in accordance with the laws of the two 
body problem and required the motion of a third mass termed the infinitesi- 
mal mass under the assumption that it is attracted by the two finite masses 
according to the Newtonian law of gravitation. 

In this paper the scope of the problem is enlarged by permitting the two
finite masses to move in accordance with an arbitrarily chosen solution of the
two body problem. One is immediately led to three types of restricted problems
according as one finite mass moves in an ellipse(2), parabola or hyperbola
about the other as focus. As might be expected, considerable simplification
occurs when the conic section degenerates into a line segment.

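A restricted problem of three bodies may always be reduced to a
quasi-Lagrangian system(3)

$$\frac{d}{dt}\frac{\partial L}{\partial \dot q_r} - \frac{\partial L}{\partial q_r} + k\,\frac{\partial L}{\partial \dot q_r} = 0, \qquad L = L(q_r, \dot q_r, t), \quad k = k(t),$$


by the introduction of suitable variables. Such systems reduce to Lagrangian
systems upon introduction of a Lagrangian function $\bar L = e^{\kappa} L$, where $\kappa = \int k\,dt$.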
Parts I, II, III of the paper are devoted to a study of these systems, partly 
with a view of their applications to the restricted problem of three bodies 


Presented to the Society in two parts, the first under the title Restricted problems in three 
bodies on December 29, 1938 and the second under the title Quasi-Lagrangian systems on April 
26, 1940; received by the editors June 27, 1941, and, in revised form, March 27, 1942. 

(1) R. Marcolongo, Il problema dei tre corpi, Milan, 1919, p. 97, ascribes the problem to
Jacobi. Recently the following papers, among others, have appeared on the problem: G. D.
Birkhoff, Sur le problème restreint des trois corps, Annali della R. Scuola Normale Superiore di
Pisa, (2), vol. 4 (1935), pp. 1-40 (first memoir) and (2), vol. 5 (1936), pp. 1-42 (second memoir);
A. Wintner, Beweis des E. Strömgrenschen dynamischen Abschlussprinzips der periodischen
Bahngruppen im restringierten Dreikörperproblem, Mathematische Zeitschrift, vol. 34 (1931), pp.
321-349, where further references are given. See also A. Wintner, The Analytical Foundations of
Celestial Mechanics, Princeton, 1941.

(2) The restricted problem of elliptic type has been investigated by F. R. Moulton, Periodic
Orbits, Carnegie Institute of Washington Publication, 1920, pp. 217-284.

(3) Special cases of these systems have been considered by Elliott. See P. Appell, Traité de
Mécanique Rationnelle, Paris, 1902, vol. 1, pp. 582-583, where further references are given.




and partly because of their intrinsic interest as examples of non-conservative 
systems. Among other things, their “limiting motions” are studied; in par- 
ticular, conditions sufficient to insure that a motion tends toward equilibrium 
are obtained. For quasi-conservative (L,=0, k=const.) systems it is possible 
to obtain a generalization of the energy integral and of the principle of least 
action, the latter involving a Mayer calculus of variations problem. Parts 
IV, V, VI deal with the restricted problem of three bodies. In the main they 
are concerned with the behavior of the infinitesimal mass as the two finite 
masses recede to infinity or approach (or leave) a collision along a degenerate 
conic section. 
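
The reduction of a quasi-Lagrangian system to a Lagrangian one, mentioned above, can be checked in a few lines. The following is a sketch rather than the author's own computation; it writes $\kappa = \int k\,dt$ and $\bar L = e^{\kappa} L$, where the letter $\kappa$ and the bar are notation assumed here.

```latex
% Assumed notation: \kappa(t) = \int k\,dt, so that \dot\kappa = k,
% and \bar L = e^{\kappa} L.
\begin{align*}
\frac{d}{dt}\frac{\partial \bar L}{\partial \dot q_r} - \frac{\partial \bar L}{\partial q_r}
  &= \frac{d}{dt}\Bigl(e^{\kappa}\,\frac{\partial L}{\partial \dot q_r}\Bigr)
     - e^{\kappa}\,\frac{\partial L}{\partial q_r} \\
  &= e^{\kappa}\Bigl(\frac{d}{dt}\frac{\partial L}{\partial \dot q_r}
     + k\,\frac{\partial L}{\partial \dot q_r}
     - \frac{\partial L}{\partial q_r}\Bigr).
\end{align*}
```

Since $e^{\kappa} > 0$, the Lagrange equations of $\bar L$ hold exactly when the quasi-Lagrangian equations of $L$ hold.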


I. QUASI-LAGRANGIAN SYSTEMS 


1. General principles. A dynamical system with $n$ degrees of freedom and
equations of motion(4)


(1) $\dfrac{d}{dt}\dfrac{\partial L}{\partial \dot q_r} - \dfrac{\partial L}{\partial q_r} + k\,\dfrac{\partial L}{\partial \dot q_r} = 0, \qquad L = (a_{rs}/2)\,\dot q_r \dot q_s + b_r \dot q_r + c, \quad k = k(t),$


is termed a quasi-Lagrangian system. If $k$ is a positive constant, the
quasi-Lagrangian system becomes a dissipative system for which Rayleigh's
dissipation function(5) is proportional to the Lagrangian function. It is readily
verified that the equations of motion may be given the variational form


(2) $\delta \displaystyle\int_{t_0}^{t_1} e^{\kappa} L\,dt = 0, \qquad \kappa = \int k\,dt,$


there being no variation in the time $t$, nor in the end points of the varied
curves. Subjected to Legendre's transformation they take the canonical
form(6)


(3) $\dot q_r = H_{p_r}, \qquad \dot p_r = -H_{q_r} - k p_r, \qquad H = (a^{rs}/2)(p_r - b_r)(p_s - b_s) - c,$


the integration of which is equivalent to the determination of a complete
solution of the partial differential equation


(4) $S_t + kS + H(t, q_r, S_{q_r}) = 0.$


This partial differential equation accordingly plays the role of the Hamilton- 
Jacobi partial differential equation. 
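
The passage from the equations of motion to the canonical form, and from there to the partial differential equation, can be traced as follows. This is a hedged sketch, assuming the quadratic Lagrangian $L = (a_{rs}/2)\dot q_r \dot q_s + b_r \dot q_r + c$ with symmetric $a_{rs}$, the momenta $p_r = \partial L/\partial \dot q_r$, and $\kappa = \int k\,dt$; the barred quantities are auxiliary notation introduced here.

```latex
% Legendre transformation (summation over repeated indices).
\begin{align*}
p_r &= a_{rs}\dot q_s + b_r, \qquad \dot q_r = a^{rs}(p_s - b_s), \\
H &= p_r \dot q_r - L = \tfrac12\, a_{rs}\dot q_r \dot q_s - c
   = \tfrac12\, a^{rs}(p_r - b_r)(p_s - b_s) - c, \\
\dot p_r &= \frac{\partial L}{\partial q_r} - k\,p_r
          = -\frac{\partial H}{\partial q_r} - k\,p_r .
\end{align*}
% Link with the ordinary Hamilton-Jacobi equation of \bar L = e^{\kappa} L:
% its momenta are \bar p_r = e^{\kappa} p_r and its Hamiltonian is
% \bar H(t, q, \bar p) = e^{\kappa} H(t, q, e^{-\kappa}\bar p).
% Putting \bar S = e^{\kappa} S gives
\begin{align*}
\bar S_t + \bar H(t, q_r, \bar S_{q_r})
  = e^{\kappa}\bigl(S_t + kS + H(t, q_r, S_{q_r})\bigr) = 0 .
\end{align*}
```

The second line of the first display uses $\partial L/\partial q_r = -\partial H/\partial q_r$, the usual property of the Legendre transformation.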
In place of Liouville’s theorem which likens the flow in the phase space 


(4) All functions are assumed analytic in their arguments. The dot denotes differentiation
with respect to the time $t$. The matrix $\|a_{rs}\|$ is assumed to be positive definite and the repeated
indices denote summation from 1 to $n$.

(5) For the theory of Rayleigh's dissipation functions see E. T. Whittaker, A Treatise on
the Analytical Dynamics of Particles and Rigid Bodies, Cambridge, 1927, pp. 230-231.

(6) The reciprocal matrix of $\|a_{rs}\|$ is denoted by $\|a^{rs}\|$.




of a Lagrangian system to that of an incompressible fluid, the flow in the phase space of a quasi-Lagrangian system obeys the law $Ve^{nl} = \text{const.}$, where $V$ is the $2n$-dimensional volume of a portion of the phase space at time $t$. To prove this, one notes that the condition(7) for $\int M\,dV$ to be an integral invariant of (3) yields $Me^{-nl} = \text{const.}$
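The volume law can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: it takes $n = 1$, the Hamiltonian $H = (p^2 + q^2)/2$, and a hypothetical damping coefficient $k(t) = 2^{-1/2}\tanh(t/2^{1/2})$ with $l(0) = 0$ (none of these choices is made at this point of the text). For this linear system the ratio $V(t)/V(0)$ is the determinant of the Jacobian of the flow, which is integrated with a classical Runge-Kutta step; the names `volume_ratio`, `k`, `l` are illustrative.

```python
import math

SQRT2 = math.sqrt(2.0)

def k(t):
    # Illustrative damping coefficient k(t) = dl/dt (an assumption of this sketch).
    return math.tanh(t / SQRT2) / SQRT2

def l(t):
    # l(t) = integral of k from 0 to t = log cosh(t / sqrt(2)).
    return math.log(math.cosh(t / SQRT2))

def rhs(t, phi):
    # Variational equations of q' = p, p' = -q - k(t) p.
    # phi = [a, b, c, d] holds the flow Jacobian [[a, b], [c, d]].
    a, b, c, d = phi
    kt = k(t)
    return [c, d, -a - kt * c, -b - kt * d]

def rk4_step(f, t, y, h):
    # One classical fourth-order Runge-Kutta step.
    k1 = f(t, y)
    k2 = f(t + h / 2, [u + h / 2 * v for u, v in zip(y, k1)])
    k3 = f(t + h / 2, [u + h / 2 * v for u, v in zip(y, k2)])
    k4 = f(t + h, [u + h * v for u, v in zip(y, k3)])
    return [u + h / 6 * (a + 2 * b + 2 * c + d)
            for u, a, b, c, d in zip(y, k1, k2, k3, k4)]

def volume_ratio(t_end, steps=4000):
    # Returns det(Jacobian) = V(t_end) / V(0) for the linear flow.
    h = t_end / steps
    t, phi = 0.0, [1.0, 0.0, 0.0, 1.0]
    for _ in range(steps):
        phi = rk4_step(rhs, t, phi, h)
        t += h
    a, b, c, d = phi
    return a * d - b * c

# V(t) e^{n l(t)} should stay equal to V(0) (here n = 1).
print(volume_ratio(3.0) * math.exp(l(3.0)))
```

With these assumptions the printed product should agree with 1 up to the integration error, while `volume_ratio(3.0)` itself is smaller than 1 because $k > 0$ for $t > 0$.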

The rate at which the energy $H$ is changing along a motion of (3) is expressed by

$$dH/dt = -kp_rH_{p_r} + H_t,$$

and the system is termed acquisitive or dissipative in a certain time interval according as $dH/dt \gtrless 0$ holds in this time interval for every motion of the system. The two forms of $H$ important for the restricted problem of three bodies turn out to be

$$\begin{aligned} H &= \tfrac{1}{2}a^{rs}(p_r - e^{-l}b_r)(p_s - e^{-l}b_s) - c, \\ H &= \tfrac{1}{2}a^{rs}(p_r - e^{-l}b_r)(p_s - e^{-l}b_s) - e^{-l}c, \end{aligned} \tag{5}$$

where $a^{rs}$, $b_r$, $c$ are functions of $q_r$ only. One finds

$$dH/dt = -ka^{rs}(p_r - e^{-l}b_r)(p_s - e^{-l}b_s),$$

$$d(e^{l}H)/dt = -(k/2)e^{l}a^{rs}(p_r - e^{-l}b_r)(p_s - e^{-l}b_s),$$

holding, respectively, so that in both cases the system is acquisitive or dissipative according as $k \lessgtr 0$.
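A pointwise numerical check of the first of these energy formulas may be sketched as follows. The example is hypothetical: it assumes two degrees of freedom, $a^{rs}$ the unit matrix, $b_1 = -By$, $b_2 = Bx$ with an arbitrary constant `B`, an arbitrary smooth function `c(x, y)`, and $l = \log\cosh t$ so that $k = \tanh t$. The total derivative of $H$ along the flow of (3) is assembled from central-difference partial derivatives and compared with $-ka^{rs}(p_r - e^{-l}b_r)(p_s - e^{-l}b_s)$.

```python
import math

B = 0.7  # hypothetical constant multiplying the gyroscopic terms

def l(t):
    # Assumed l(t); its derivative is k(t) below.
    return math.log(math.cosh(t))

def k(t):
    return math.tanh(t)

def c(x, y):
    # Arbitrary smooth function of the coordinates only (an assumption).
    return math.sin(x) * math.cos(y)

def H(t, x, y, p, q):
    # First form in (5) with a^{rs} = identity, b_1 = -B y, b_2 = B x.
    e = math.exp(-l(t))
    return 0.5 * ((p + B * e * y) ** 2 + (q - B * e * x) ** 2) - c(x, y)

def partial(f, args, i, h=1e-5):
    # Central-difference partial derivative of f with respect to args[i].
    up, dn = list(args), list(args)
    up[i] += h
    dn[i] -= h
    return (f(*up) - f(*dn)) / (2 * h)

def dH_dt(t, x, y, p, q):
    # Total derivative of H along a motion of (3), built from partials.
    args = (t, x, y, p, q)
    Ht, Hx, Hy, Hp, Hq = (partial(H, args, i) for i in range(5))
    xdot, ydot = Hp, Hq
    pdot, qdot = -Hx - k(t) * p, -Hy - k(t) * q
    return Hx * xdot + Hy * ydot + Hp * pdot + Hq * qdot + Ht

def predicted(t, x, y, p, q):
    # Right member of the first energy formula.
    e = math.exp(-l(t))
    return -k(t) * ((p + B * e * y) ** 2 + (q - B * e * x) ** 2)

point = (0.4, 0.3, -0.2, 0.5, 0.1)
print(dH_dt(*point), predicted(*point))
```

The two printed numbers should agree to within the finite-difference error; the sign of the common value is opposite to that of $k$, in line with the acquisitive or dissipative classification.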

2. Limiting motions. A motion $\bar M$ defined(8) for $a < t < b$ is an $\omega$-limiting motion of a motion $M$ defined for $t > t_0$ if, given any subinterval $a_0 \le t \le b_0$ of $a < t < b$, an arbitrarily fixed, small positive number $\delta$, and an arbitrarily fixed, large positive number $T$, there exists a $\tau > T$ such that the point $P(t)$ on $M$ is at a distance less than $\delta$ from the point $\bar P(t - \tau)$ on $\bar M$ for $a_0 \le t - \tau \le b_0$. An $\omega$-limiting point of $M$ is a point of accumulation of a sequence of points $P(t_i)$ on $M$ for which $t_i \to +\infty$. $\alpha$-limiting motions and $\alpha$-limiting points are defined similarly for $t \to -\infty$. Assuming that the second members in (3) are regular analytic in $q_r$, $p_s$, $t$ throughout the phase space with the exception of the points of a set $S$ at which singularities occur for $t > t_0$, a motion $M$ defined for $t > t_0$ is positively stable if it remains in a bounded portion of the phase space and does not come arbitrarily close to $S$; otherwise $M$ is positively instable. Negatively stable (instable) motions are defined similarly and a motion both positively and negatively stable is termed stable.

If $H$, $H_{q_r}$, $H_{p_r}$ tend towards limiting functions $\bar H$, $\bar H_{q_r}$, $\bar H_{p_r}$ uniformly in any bounded, closed subregion of the phase space not containing points of $S$ while $k$ tends toward a finite limit $\bar k$ as $t \to +\infty$, the $\omega$-limiting motions of a


(7) See E. T. Whittaker, op. cit., pp. 283-284.

(8) The possibilities $a = -\infty$, $b = +\infty$ are not to be excluded.


motion $M$ of (3) are(9) motions of the limiting system

$$\dot q_r = \bar H_{p_r}, \qquad \dot p_r = -\bar H_{q_r} - \bar k p_r. \tag{6}$$

In case $M$ is positively stable its $\omega$-limiting motions comprise a set of stable motions approached uniformly by $M$ as $t \to +\infty$.


THEOREM 1. If the Hamiltonian function $H$ has either form in (5) and if $k$ tends toward a finite positive limit $\bar k$ as $t \to +\infty$, the $\omega$-limiting motions of a positively stable motion $M$ of (3) are equilibrium motions(9) of the system (6) with $\bar H = (a^{rs}p_rp_s/2) - c$, or $\bar H = a^{rs}p_rp_s/2$.


Along $M$ the energy $H$ is eventually a monotone decreasing function of $t$ and will accordingly approach a finite limit $h$, since $M$ is positively stable. Along an $\omega$-limiting motion $\bar M$ of $M$ we have $\bar H = h$. To see this let $\bar P$ be any point of $\bar M$ and $P$ be the point on $M$ at time $t$. Clearly

$$|\bar H(\bar P) - h| \le |\bar H(\bar P) - \bar H(P)| + |\bar H(P) - H(P)| + |H(P) - h|.$$

Let $\epsilon$ be an arbitrarily small positive number. In view of the uniform convergence of $H$ towards $\bar H$ we may select $t_1$ such that $|\bar H(P) - H(P)| < \epsilon$ for all $t > t_1$. Since $H(P)$ tends to $h$ as a limit, there exists a $t_2$ for which $|H(P) - h| < \epsilon$ for all $t > t_2$. Finally $\bar H(P)$ is a continuous function of $P$ and $\bar P$ is an $\omega$-limiting point of $M$. Hence there exists a value of $t$ greater than either $t_1$ or $t_2$ for which $|\bar H(\bar P) - \bar H(P)| < \epsilon$. It follows that $|\bar H(\bar P) - h|$ is arbitrarily small and therefore equals zero.

Along $\bar M$ we have $d\bar H/dt = -\bar k a^{rs}p_rp_s = 0$ and therefore, since $\|a^{rs}\|$ is positive definite, the $p_r$ are zero on $\bar M$. Hence $\bar M$ is an equilibrium motion.

It is obvious that these systems possess no periodic motions. 


II. QUASI-CONSERVATIVE SYSTEMS 


1. General principles. A quasi-Lagrangian system is termed quasi-conservative if $L_t = 0$ and $k$ is a constant other than zero. For such systems there exists a generalization of the energy integral.


THEOREM 2. If $S$ be defined along a motion $M$ by

$$e^{kt}S = e^{kt_0}S_0 + \int_{t_0}^{t} e^{kt}(p_rH_{p_r} - H)\,dt, \tag{7}$$

the quantity $(H + kS)e^{kt}$ retains a constant value along $M$.

(9) For the treatment of steady flows, see G. D. Birkhoff, Quelques théorèmes sur le mouvement des systèmes dynamiques, Bulletin de la Société Mathématique de France, vol. 40 (1912), pp. 305-323. For a treatment of dissipative systems not involving the time explicitly, see his book Dynamical Systems, American Mathematical Society Colloquium Publications, vol. 9, 1927, pp. 31-32.

Placing $z = S$, $p = S_t$, $p_r = S_{q_r}$ in (4), the differential equations of the characteristic strips of (4) are

$$t' = 1, \quad q_r' = H_{p_r}, \quad p' = -kp, \quad p_r' = -H_{q_r} - kp_r, \quad z' = p + p_rH_{p_r}, \qquad (' = d/du). \tag{8}$$

They possess the integral

$$p + kz + H(q_r, p_s) = \text{const.} \tag{9}$$

By adjoining

$$t = u, \qquad p = p_0e^{-kt}, \qquad z = z_0 + \int_{t_0}^{t}(p_rH_{p_r} + p)\,dt,$$

to the equations of a motion of (3), one obtains a solution of (8). Taking $p_0$ arbitrarily, $z_0$ is determined so that the constant in (9) equals zero, and the relation obtained is used to eliminate $p$ from the last equation in (8) to obtain

$$dz/dt + kz = p_rH_{p_r} - H,$$

which, when integrated, yields (7). The theorem then follows from (9).


COROLLARY. If $H$ is homogeneous of degree two in $q_r$, $p_s$, a quasi-conservative system has a first integral $(H + (k/2)p_rq_r)e^{kt} = \text{const.}$

To establish the corollary, set $2H = q_rH_{q_r} + p_rH_{p_r}$ in (7) and substitute for $H_{q_r}$, $H_{p_r}$ from (3) to obtain $2e^{kt}S = e^{kt}p_rq_r + \text{const.}$
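The first integral of the corollary is easily tested on the simplest homogeneous Hamiltonian. The following is a minimal sketch, assuming one degree of freedom, $H = (p^2 + q^2)/2$ and a hypothetical constant value `K` of $k$, so that (3) reads $\dot q = p$, $\dot p = -q - kp$ (a damped oscillator). The quantity $(H + (k/2)pq)e^{kt}$ is evaluated at both ends of a numerically integrated motion.

```python
import math

K = 0.3  # hypothetical constant value of k

def rhs(t, y):
    # Canonical equations (3) for H = (p^2 + q^2)/2: q' = p, p' = -q - K p.
    q, p = y
    return [p, -q - K * p]

def rk4_step(f, t, y, h):
    # One classical fourth-order Runge-Kutta step.
    k1 = f(t, y)
    k2 = f(t + h / 2, [u + h / 2 * v for u, v in zip(y, k1)])
    k3 = f(t + h / 2, [u + h / 2 * v for u, v in zip(y, k2)])
    k4 = f(t + h, [u + h * v for u, v in zip(y, k3)])
    return [u + h / 6 * (a + 2 * b + 2 * c + d)
            for u, a, b, c, d in zip(y, k1, k2, k3, k4)]

def first_integral(t, q, p):
    # (H + (k/2) p q) e^{kt}, the quantity asserted to be constant.
    return (0.5 * (p * p + q * q) + 0.5 * K * p * q) * math.exp(K * t)

def integrate(q0, p0, t_end, steps=5000):
    # Follow the motion from t = 0 to t = t_end.
    h = t_end / steps
    t, y = 0.0, [q0, p0]
    for _ in range(steps):
        y = rk4_step(rhs, t, y, h)
        t += h
    return y

q_end, p_end = integrate(1.0, 0.5, 5.0)
print(first_integral(0.0, 1.0, 0.5), first_integral(5.0, q_end, p_end))
```

The two printed values should coincide up to the integration error, although the energy $H$ itself decays along the motion.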

The generalization of the principle of least action to quasi-conservative systems rests on the following lemma(10).

LEMMA. The projections of the characteristic curves of the partial differential equation

$$f(x_r, z, p_r) = 0, \qquad p_r = z_{x_r}, \qquad \det\|f_{p_rp_s}\| \ne 0,$$

upon the space of the independent variables $x_r$ are the extremals of the Mayer calculus of variations problem $\delta z_1 = 0$, given that

$$dz/dt = F(x_r, z, \dot x_r), \qquad \delta z_0 = 0, \qquad \delta x_r = 0,$$

where $\zeta - z = F(x_r, z, \xi_r - x_r)$ is the equation of the Monge cone for the partial differential equation at the point $x_r$, $z$.


THE PRINCIPLE OF LEAST ACTION. The projections of the characteristic curves of the partial differential equation $kz + H(q_r, p_s) = 0$, $p_s = z_{q_s}$, upon the space of the coordinates $q_r$ are the extremals of the Mayer calculus of variations problem $\delta z_1 = 0$, given that

$$dz/dt = (2(c - kz)a_{rs}\dot q_r\dot q_s)^{1/2} + b_r\dot q_r, \qquad \delta z_0 = 0, \qquad \delta q_r = 0.$$


(10) For the proof of this lemma for the partial differential equation $f(x, y, p, q) = 0$, see A. Kneser, Lehrbuch der Variationsrechnung, Braunschweig, 1925, pp. 157-160.


To arrive at the principle, we set $S = he^{-kt} + W$, where $h = \text{const.}$ and $W = W(q_1, \cdots, q_n)$, in (4) to obtain the partial differential equation $kW + H(q_r, W_{q_r}) = 0$. Placing $z = W$, $p_r = W_{q_r}$ and calculating the equation of the Monge cone for this partial differential equation, the truth of the principle follows from the lemma.

To see why the principle is to be regarded as a generalization of the principle of least action for conservative systems, it will be recalled that placing $S = -ht + W$ in the Hamilton-Jacobi partial differential equation of a conservative system leads to the partial differential equation $-h + H(q_r, W_{q_r}) = 0$. The projections upon the space of the coordinates $q_r$ of the characteristics of this partial differential equation are the extremals of the Mayer calculus of variations problem $\delta z_1 = 0$, given that

$$dz/dt = (2(c + h)a_{rs}\dot q_r\dot q_s)^{1/2} + b_r\dot q_r, \qquad \delta z_0 = 0, \qquad \delta q_r = 0.$$

When this is formulated as an ordinary calculus of variations problem with fixed end points

$$\delta\int\{(2(c + h)a_{rs}\dot q_r\dot q_s)^{1/2} + b_r\dot q_r\}\,dt = 0,$$

one obtains the classic expression for the principle of least action.

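2. Characteristic exponents(11). Assuming that the origin of the phase space is an equilibrium motion of (3), the equations of variation written in matrix form are

$$dx/dt = Ax, \qquad A = GH - kK. \tag{10}$$

Here $x$ denotes a matrix with $2n$ rows and one column, the first and last $n$ rows being occupied by the variations of $q_r$, $p_r$, respectively, while(12)

$$G = \begin{Vmatrix} o & e \\ -e & o \end{Vmatrix}, \quad H = \begin{Vmatrix} H_{q_rq_s} & H_{q_rp_s} \\ H_{p_rq_s} & H_{p_rp_s} \end{Vmatrix}, \quad K = \begin{Vmatrix} o & o \\ o & e \end{Vmatrix}, \quad M = \begin{Vmatrix} o & e \\ e & o \end{Vmatrix},$$

where $o$, $e$ denote the $n \times n$ zero and unit matrices, respectively, and the partial derivatives are evaluated at the origin. Denoting the transposed matrix by affixing a prime, one verifies that

$$G' = -G, \quad GG' = G'G = -G^2 = E; \qquad K' = K, \quad GKG = K - E, \tag{11}$$

where $E$ is the unit matrix of $2n$ rows and columns.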


(11) For the theorems on characteristic exponents of conservative systems, see G. D. Birkhoff, Dynamical Systems, loc. cit., pp. 74-78, and A. Wintner, Three notes on characteristic exponents and equations of variation in celestial mechanics, American Journal of Mathematics, vol. 53 (1931), p. 609.

(12) The matrix $M$ defined here will not be needed until Theorem 4.


THEOREM 3. The characteristic exponents may be divided into pairs in which the two members are of equal multiplicity and have $-k$ for their sum.

The characteristic exponents are the roots of the equation $f(\lambda) = \det(A - \lambda E) = 0$. Replacing $A$ by its transposed, multiplying before and behind by $G$, it follows from (11) that $f(\lambda) = f(-\lambda - k)$, and the theorem is proved.
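The symmetry $f(\lambda) = f(-\lambda - k)$ can be spot-checked numerically. The sketch below is illustrative only: it assumes $n = 2$, a hypothetical constant `K` for $k$, and an arbitrary symmetric matrix `HESS` standing in for the matrix $H$ of second derivatives; the determinant is computed by elimination with partial pivoting.

```python
K = 0.7  # hypothetical constant k

# Arbitrary symmetric 4 x 4 matrix playing the role of H (n = 2).
HESS = [[2.0, 0.3, -0.5, 0.1],
        [0.3, 1.0, 0.4, -0.2],
        [-0.5, 0.4, 1.5, 0.6],
        [0.1, -0.2, 0.6, 0.8]]

# G = [[o, e], [-e, o]] and K_MAT = [[o, o], [o, e]] for n = 2.
G = [[0, 0, 1, 0], [0, 0, 0, 1], [-1, 0, 0, 0], [0, -1, 0, 0]]
K_MAT = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def matmul(a, b):
    # Plain matrix product of two square lists of lists.
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]

def det(m):
    # Determinant by Gaussian elimination with partial pivoting.
    a = [[float(v) for v in row] for row in m]
    n, d = len(a), 1.0
    for i in range(n):
        piv = max(range(i, n), key=lambda r: abs(a[r][i]))
        if a[piv][i] == 0.0:
            return 0.0
        if piv != i:
            a[i], a[piv] = a[piv], a[i]
            d = -d
        d *= a[i][i]
        for r in range(i + 1, n):
            factor = a[r][i] / a[i][i]
            for col in range(i, n):
                a[r][col] -= factor * a[i][col]
    return d

GH = matmul(G, HESS)
A = [[GH[i][j] - K * K_MAT[i][j] for j in range(4)] for i in range(4)]

def f(lam):
    # Characteristic polynomial f(lam) = det(A - lam E).
    return det([[A[i][j] - (lam if i == j else 0.0) for j in range(4)]
                for i in range(4)])

for lam in (0.3, -1.1, 2.0):
    print(lam, f(lam), f(-lam - K))
```

Each printed pair of values should agree to rounding error, which is the pairing of the exponents $\lambda$ and $-\lambda - k$ asserted by the theorem.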

It may be noted in passing that at least half of the characteristic expo- 
nents have negative (positive) real parts if k>0 (k <0). 


THEOREM 4. If the matrix $H + (k/2)M$ is definite, the characteristic exponents lie on the line $R(\lambda) = -k/2$ with $\lambda = -k/2$ excluded.

Corresponding to a characteristic exponent $\lambda$, there is a solution of (10) given by $x = ae^{\lambda t}$, where $a$ is a constant matrix satisfying $(A - \lambda E)a = 0$. Multiplying on the left by $a'G$ and observing that $2GK = M + G$, it is found that

$$a'(H + (k/2)M)a = 0, \tag{12}$$

inasmuch as $G$ is skew-symmetric. If $\lambda$ is real, $a$ may be taken real and the assumption that $H + (k/2)M$ is definite is contradicted by (12). Thus no characteristic exponent is real.

We may regard (10) as the canonical equations of a quasi-conservative system whose Hamiltonian function $H$ is homogeneous of degree 2 in its arguments and write the first integral, previously obtained in the corollary to Theorem 2, in the matrix form $x'(H + (k/2)M)xe^{kt} = \text{const.}$ For complex $\lambda$, the real solution $x + \bar x$ is formed and inserted in the energy integral. Keeping (12) in mind, one finds that $a'(H + (k/2)M)\bar a \exp{(k + \lambda + \bar\lambda)t} = \text{const.}$, which requires $R(\lambda) = -k/2$.


III. NATURAL QUASI-CONSERVATIVE SYSTEMS 


A quasi-Lagrangian system (3) is a natural system if $b_r = 0$. Taking $b_r = 0$ in (5), it follows from Theorem 1 that the $\omega$-limiting motions of a positively stable motion of a natural quasi-conservative system (3) are equilibrium motions of (3). Such systems accordingly possess no periodic motions.


THEOREM 5. To each characteristic constant $\delta_r$ of the matrix $\|-c_{q_rq_s}\|$, evaluated at an equilibrium motion of (3), there correspond two characteristic exponents given by the roots of the equation

$$\lambda^2 + k\lambda + \delta_r = 0.$$

Referring to §2 of II, it is apparent that

$$H = \begin{Vmatrix} \delta & o \\ o & \alpha \end{Vmatrix}, \qquad \delta = \|-c_{q_rq_s}\|, \qquad \alpha = \|a^{rs}\|,$$

and it may be verified that the transformation $x = Ty$, where

$$T = \begin{Vmatrix} \gamma & o \\ o & \gamma'^{-1} \end{Vmatrix},$$

carries (10) into $dy/dt = By$, with $B = GT'HT - kK$. Choosing $\gamma$ so that $\gamma'\alpha^{-1}\gamma = e$, $\gamma'\delta\gamma = \bar\delta$, where $\bar\delta$ is a diagonal matrix with the characteristic constants of $\delta$ along the principal diagonal, the theorem follows inasmuch as the characteristic exponents satisfy the equation $\det(B - \lambda E) = 0$.

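It is clear that there are no purely imaginary characteristic exponents. If the characteristic exponents $\lambda_1, \cdots, \lambda_m$ have negative real parts and the remaining $2n - m$ have positive real parts, the equilibrium motion is of negative or positive general type, according as none of the linear commensurability relations

$$\begin{aligned} \text{I:}\quad & p_1\lambda_1 + \cdots + p_m\lambda_m = \lambda_j, \quad & p_1 + \cdots + p_m \ge 2, \quad & j = 1, \cdots, m; \\ \text{II:}\quad & p_{m+1}\lambda_{m+1} + \cdots + p_{2n}\lambda_{2n} = \lambda_k, \quad & p_{m+1} + \cdots + p_{2n} \ge 2, \quad & k = m+1, \cdots, 2n; \end{aligned} \qquad p_i = 0, 1, 2, \cdots; \tag{13}$$

in I or II is satisfied.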


THEOREM 6. If an equilibrium motion $M_0$ is of negative [positive] general type, a suitably restricted neighborhood of $M_0$ contains an analytic $m$-dimensional [$(2n - m)$-dimensional] surface, the points of which tend towards $M_0$ as a limit as $t \to +\infty$ [$-\infty$]. No other points of this neighborhood tend towards $M_0$ as a limit as $t \to +\infty$ [$-\infty$]. The number $m$ is the number of characteristic exponents of $M_0$ with negative real parts(13).


IV. THE RESTRICTED PROBLEM OF THREE BODIES 


Two finite masses $\mu$ and $1 - \mu$ move in a fixed plane under their mutual gravitational attraction. The restricted problem of three bodies is the problem of determining the motion in the fixed plane of a third mass (the infinitesimal mass) subject to the gravitational attraction of the two finite masses, the motion of which is assumed to proceed independently of its presence. The problem is said to be of elliptic, parabolic or hyperbolic type according as the orbit of $\mu$ about $1 - \mu$ is an ellipse, parabola or hyperbola. When the conic section degenerates into a line segment the problem is termed rectilinear, otherwise we term the problem general.

If we take the center of mass of the two finite masses as origin of a rectangular coordinate system $(\xi, \eta)$ having a fixed orientation in the plane of motion of the two finite masses, it is known that

$$\rho'' - \rho\theta'^2 = -\kappa^2/\rho^2, \qquad \rho^2\theta' = a = \text{const.}, \qquad (\kappa^2 = \text{gravitational constant}), \tag{14}$$


(13) A proof of this theorem has been given by the writer in Bulletin of the American Mathematical Society, vol. 46 (1940), pp. 475-481. See also C. L. Siegel, Der Dreierstoss, Annals of Mathematics, (2), vol. 42 (1941), pp. 156-165.


where $\rho$, $\theta$, respectively, denote the length and angle of inclination with the positive $\xi$-axis of the line joining the two finite masses, differentiation with respect to the time $\tau$ being indicated by a prime. Denoting the distances of the infinitesimal mass from $1 - \mu$, $\mu$ by $\rho_1$, $\rho_2$, respectively, the Lagrangian function $\Lambda$ for its motion is

$$\Lambda = (\xi'^2 + \eta'^2)/2 + \kappa^2((1 - \mu)/\rho_1 + \mu/\rho_2).$$

THEOREM 7. The introduction of new variables $x$, $y$, $t$ by

$$\xi + i\eta = \rho e^{i\theta}(x + iy), \qquad \kappa\,d\tau = \rho^{3/2}\,dt, \tag{15}$$

reduces the restricted problem of three bodies to a quasi-Lagrangian system in which

$$\begin{aligned} L &= (\dot x^2 + \dot y^2)/2 + be^{-l}(x\dot y - y\dot x) + \Omega, \qquad & b &= a/\kappa, \\ H &= [(p + be^{-l}y)^2 + (q - be^{-l}x)^2]/2 - \Omega, \qquad & l &= (\log\rho)/2, \end{aligned} \tag{16}$$

where

$$\Omega = (x^2 + y^2)/2 + (1 - \mu)/r_1 + \mu/r_2, \qquad r_1^2 = (x + \mu)^2 + y^2, \qquad r_2^2 = (x + \mu - 1)^2 + y^2, \tag{17}$$

and differentiation with respect to $t$ is indicated by a dot. The system is acquisitive or dissipative according as the two finite masses approach or leave each other and reduces to a natural system for rectilinear problems.


It may be verified that

$$\begin{aligned} \xi'^2 + \eta'^2 &= \rho^2(x'^2 + y'^2) + 2\rho^2\theta'(xy' - yx') \\ &\quad + (\rho'^2 + \rho^2\theta'^2)(x^2 + y^2) + 2\rho\rho'(xx' + yy'), \\ [\rho\rho'(x^2 + y^2)]' &= (\rho'^2 + \rho\rho'')(x^2 + y^2) + 2\rho\rho'(xx' + yy'). \end{aligned}$$

When these equations are subtracted with (14) in mind, the result substituted in $\Lambda$, and the term $[\rho\rho'(x^2 + y^2)]'$ suppressed(14), it is found that

$$\Lambda = (\rho^2/2)(x'^2 + y'^2) + a(xy' - yx') + (\kappa^2/\rho)\Omega,$$

provided one observes that $\rho_1 = \rho r_1$, $\rho_2 = \rho r_2$.

Writing the equations of motion for the infinitesimal mass in the variational form $\delta\int_{\tau_0}^{\tau_1}\Lambda\,d\tau = 0$ and introducing the new independent variable $t$ defined in (15), this variational equation takes the form (2) with $l$, $L$ defined as in (16).

It will be observed that the Hamiltonian function $H$ has the first form in (5). The system is therefore acquisitive or dissipative as stated.
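The expansion of $\xi'^2 + \eta'^2$ used in this proof can be confirmed with complex arithmetic. In the sketch below the $\tau$-derivative of $\xi + i\eta = \rho e^{i\theta}(x + iy)$ is formed by the product rule and its squared modulus is compared with the four-term expression; the function names and the sample values are arbitrary and serve only as an illustration.

```python
import cmath

def speed_squared_direct(rho, drho, theta, dtheta, x, y, dx, dy):
    # |d/dtau [rho e^{i theta} (x + i y)]|^2 by the product rule.
    z, dz = complex(x, y), complex(dx, dy)
    w = cmath.exp(1j * theta) * ((drho + 1j * rho * dtheta) * z + rho * dz)
    return abs(w) ** 2

def speed_squared_expanded(rho, drho, theta, dtheta, x, y, dx, dy):
    # The four-term expansion quoted in the proof of Theorem 7.
    return (rho ** 2 * (dx * dx + dy * dy)
            + 2 * rho ** 2 * dtheta * (x * dy - y * dx)
            + (drho ** 2 + rho ** 2 * dtheta ** 2) * (x * x + y * y)
            + 2 * rho * drho * (x * dx + y * dy))

sample = (1.7, -0.4, 0.9, 0.6, 0.3, -1.1, 0.8, 0.2)  # arbitrary test values
print(speed_squared_direct(*sample), speed_squared_expanded(*sample))
```

The two printed numbers should agree to rounding error for any choice of the eight arguments.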


(14) This may be done since the variation of its integral vanishes.


Remark. If the new independent variable is defined by $\kappa\,d\tau = \rho\,dt$ the restricted problem of three bodies reduces to a quasi-Lagrangian system with

$$\begin{aligned} L &= (\dot x^2 + \dot y^2)/2 + be^{-l}(x\dot y - y\dot x) + e^{-l}\Omega, \qquad & b &= a/\kappa, \\ H &= [(p + be^{-l}y)^2 + (q - be^{-l}x)^2]/2 - e^{-l}\Omega, \qquad & l &= \log\rho, \end{aligned} \tag{18}$$

where the dot denotes differentiation with respect to $t$. The Hamiltonian function has the second form in (5), the system being acquisitive or dissipative according as the finite masses approach or leave each other, and is a natural system for rectilinear problems.


V. THE RESTRICTED PROBLEM OF PARABOLIC TYPE 


1. The differential equations of motion. If $\mu$ moves in a parabolic orbit about $1 - \mu$, it is known(15) that

$$\rho = (p/2)\sec^2(\theta/2), \qquad 2\kappa\tau = p^{3/2}((1/3)\tan^3(\theta/2) + \tan(\theta/2)), \qquad p > 0, \tag{19}$$

where $p$ is the semi-latus rectum of the parabolic orbit. In case $p = 0$, the parabolic orbit degenerates into a line segment, and we have

$$\rho = (3\kappa\tau/2^{1/2})^{2/3}. \tag{20}$$

When the variable $t$ of Theorem 7 is introduced in place of $\tau$, these equations are replaced by

$$\rho = (p/2)\cosh^2(t/2^{1/2}), \qquad \sin(\theta/2) = \tanh(t/2^{1/2}), \tag{21}$$

$$\rho = e^{\pm 2^{1/2}t}. \tag{22}$$

Thus, as $t$ ranges from $-\infty$ to $+\infty$, the mass $\mu$ describes the complete parabolic orbit ($p > 0$) or describes the line segment ($p = 0$), approaching a collision with $1 - \mu$ as $t \to \pm\infty$ according as the negative or positive sign is taken in (22).
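The passage from (19) to (21) may be checked numerically. The sketch below assumes illustrative values `P` and `KAPPA` for the semi-latus rectum and for $\kappa$; it evaluates $\rho$ and $\tau$ from (19) along $\sin(\theta/2) = \tanh(t/2^{1/2})$, compares $\rho$ with the hyperbolic-cosine form, and tests the defining relation $\kappa\,d\tau = \rho^{3/2}\,dt$ of (15) by a central difference.

```python
import math

P = 2.0      # illustrative semi-latus rectum p (an assumption)
KAPPA = 1.0  # illustrative value of kappa (an assumption)
SQRT2 = math.sqrt(2.0)

def theta_of_t(t):
    # Second relation of (21): sin(theta/2) = tanh(t / sqrt(2)).
    return 2.0 * math.asin(math.tanh(t / SQRT2))

def rho_19(theta):
    # First relation of (19).
    return (P / 2.0) / math.cos(theta / 2.0) ** 2

def tau_19(theta):
    # Second relation of (19), solved for tau.
    u = math.tan(theta / 2.0)
    return P ** 1.5 * (u ** 3 / 3.0 + u) / (2.0 * KAPPA)

def rho_21(t):
    # First relation of (21).
    return (P / 2.0) * math.cosh(t / SQRT2) ** 2

def dtau_dt(t, h=1e-5):
    # Central-difference derivative of tau with respect to t.
    return (tau_19(theta_of_t(t + h)) - tau_19(theta_of_t(t - h))) / (2.0 * h)

for t in (-1.3, 0.0, 0.8):
    print(t, rho_19(theta_of_t(t)) - rho_21(t),
          KAPPA * dtau_dt(t) - rho_21(t) ** 1.5)
```

Both differences printed for each $t$ should be close to zero, the second up to the finite-difference error.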
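Corresponding to (21), (22) the function $l$ in (16) is given by

$$l = (1/2)\log(p/2) + \log\cosh(t/2^{1/2}), \qquad l = \pm t/2^{1/2},$$

from which the differential equations (1) are found, on setting $c = b(2/p)^{1/2}$,

$$\begin{aligned} \ddot x + (1/2^{1/2})\tanh(t/2^{1/2})\cdot\dot x - 2c\,\mathrm{sech}(t/2^{1/2})\cdot\dot y &= \Omega_x, \\ \ddot y + (1/2^{1/2})\tanh(t/2^{1/2})\cdot\dot y + 2c\,\mathrm{sech}(t/2^{1/2})\cdot\dot x &= \Omega_y, \end{aligned} \tag{23}$$

$$\ddot x \pm (1/2^{1/2})\dot x = \Omega_x, \qquad \ddot y \pm (1/2^{1/2})\dot y = \Omega_y, \tag{24}$$

the latter equations holding for the rectilinear problem, with the positive or negative sign taken according as the finite masses leave or approach each other. The canonical form (3) of (23) is

$$\dot x = H_p, \qquad \dot y = H_q, \qquad \dot p = -H_x - kp, \qquad \dot q = -H_y - kq, \tag{25}$$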


(15) See, for example, F. R. Moulton, An Introduction to Celestial Mechanics, New York, 1935, pp. 155-159.


where $k = (1/2^{1/2})\tanh(t/2^{1/2})$ and $H$ is defined in (16), while that of (24) is

$$\dot x = \bar H_p, \qquad \dot y = \bar H_q, \qquad \dot p = -\bar H_x - \bar kp, \qquad \dot q = -\bar H_y - \bar kq, \qquad \bar H = (p^2 + q^2)/2 - \Omega, \qquad \bar k = \pm 1/2^{1/2}. \tag{26}$$

Corresponding to the positive and negative values for $\bar k$ we have two kinds of rectilinear problems; the former is termed the dissipative rectilinear problem and the latter the acquisitive rectilinear problem.

2. The flow in the phase space. A comparison of (25) and (26) shows that the $\alpha$- and $\omega$-limiting motions in the general problem are, respectively, motions of the acquisitive and dissipative rectilinear problems.

The set $S$ of singularities of the system (25) comprises the points of the two-dimensional surfaces $x = -\mu$, $y = 0$; $x = 1 - \mu$, $y = 0$ corresponding to collisions of the infinitesimal mass with one or the other of the finite masses.


THEOREM 8. If the energy $H$ of a motion tends to $-\infty$ as $t \to +\infty$ the motion is positively instable and the point $P$ in the $(x, y)$-plane corresponding to the infinitesimal mass either tends uniformly towards one of the points corresponding to the finite masses or else tends uniformly towards the point at infinity as $t \to +\infty$.


Suppose that $P$ tends uniformly to neither of the points corresponding to the finite masses nor to the point at infinity as $t \to +\infty$. There would exist a positive $\epsilon$ and an infinite sequence $\{t_n\}$ of $t$-values for which $P$ would lie in the region

$$(x + \mu)^2 + y^2 > \epsilon, \qquad (x + \mu - 1)^2 + y^2 > \epsilon, \qquad x^2 + y^2 < 1/\epsilon,$$

and from the definitions of $H$, $\bar H$ in (16), (26) there would exist a constant $K$ such that $H(t_n) > K$, thus contradicting the hypothesis.
The equilibrium motions of the rectilinear problems are characterized by

$$p = 0, \qquad q = 0, \qquad \Omega_x = 0, \qquad \Omega_y = 0,$$

the latter two equations being satisfied at exactly five points $L_i$ in the $(x, y)$-plane known as libration points. $L_1$, $L_2$, $L_3$ lie on the $x$-axis with $L_1$ between, $L_2$ to the right of, and $L_3$ to the left of the points corresponding to the finite masses which in turn form equilateral triangles with $L_4$, $L_5$. Corresponding to $L_i$ there are five equilibrium motions $E_i$ in the phase space.


THEOREM 9. If the point $P$ representing the infinitesimal mass remains in a bounded closed region of the $(x, y)$-plane containing $L_i$ and not containing the points corresponding to the finite masses, it tends uniformly towards a definite $L_i$ as $t \to +\infty$.


In view of the restriction upon $P$ it is clear that $-\Omega$ has a finite lower bound. Since $H$ decreases monotonely for sufficiently large $t$, it follows that $p^2 + q^2$ remains bounded for $t > t_0$. The corresponding motion $M$ in the phase space is accordingly positively stable and it follows from Theorem 1 that the $\omega$-limiting motions of $M$ are the equilibrium motions $E_i$ of (26). Now $M$ approaches its set of $\omega$-limiting motions uniformly and therefore it approaches a definite $E_i$ uniformly as $t \to +\infty$.

The proof of the theorem has been given only for the general problem. It is evident that it remains in force for the rectilinear problem and when $t \to -\infty$.

Since a stable motion has one $E_i$ as a unique $\alpha$-limiting point and one as a unique $\omega$-limiting point, the stable motions may be divided into twenty-five classes, inasmuch as the $\alpha$- and $\omega$-limiting points conceivably may be chosen at random from the five $E_i$.

3. The rectilinear problem. The differential equations (24) for the acquisitive and dissipative rectilinear problems interchange when $t$ is replaced by $-t$. It will be sufficient, therefore, to consider one of these problems. We shall select the dissipative problem for further investigation.


THEOREM 10. If stable motions other than equilibrium motions exist in the dissipative rectilinear problem, they fall into the following nine classes:

(i) $\alpha$-limiting point $E_4$ or $E_5$, $\omega$-limiting point $E_1$, $E_2$, $E_3$;

(ii) $\alpha$-limiting point $E_3$, $\omega$-limiting point $E_1$, $E_2$;

(iii) $\alpha$-limiting point $E_2$, $\omega$-limiting point $E_1$;

provided $0 < \mu < 1/2$. When $\mu = 1/2$ there is no motion in (ii) with $E_2$ as $\omega$-limit point and when $1/2 < \mu < 1$, the roles of $E_2$, $E_3$ are interchanged.


Along a motion $M$ of (26) not an equilibrium motion $\bar H$ decreases monotonely with increasing $t$ for all $t$. Therefore an $E_i$ cannot serve simultaneously as $\alpha$- and $\omega$-limiting point for $M$, since this would require $\bar H$ constant along $M$ and $d\bar H/dt = -(p^2 + q^2)/2^{1/2} = 0$ would imply that $M$ is an equilibrium motion, contrary to hypothesis.

If $\Omega_i$ is the value of $\Omega$ at $L_i$, it is known(16) that $\Omega_2$ is greater than, equal to, or less than $\Omega_3$ according as $\mu$ is less than, equal to, or greater than 1/2 and that $\Omega_1$ exceeds while $\Omega_4 = \Omega_5$ is less than either of $\Omega_2$, $\Omega_3$. Therefore, if $\bar H_i$ indicates the value of $\bar H$ at $E_i$ it follows that $\bar H_2$ is less than, equal to, or greater than $\bar H_3$ according as $\mu$ is less than, equal to, or greater than 1/2 and that $\bar H_1$ is less than while $\bar H_4 = \bar H_5$ is greater than either $\bar H_2$ or $\bar H_3$.

It is therefore impossible for a stable motion to have one of $E_4$, $E_5$ as $\alpha$-limiting point and the other for $\omega$-limiting point and the theorem is an immediate consequence of the above inequalities and the monotone character of $\bar H$.
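The inequalities among the $\Omega_i$ quoted from the literature can be reproduced numerically. The sketch below locates the three collinear libration points by bisection on $\Omega_x(x, 0)$ and uses the equilateral positions $(1/2 - \mu, \pm 3^{1/2}/2)$ for the other two; which of these is labelled $L_4$ is an assumption of the sketch, as are the function names and the bracketing intervals.

```python
import math

def omega(x, y, mu):
    # The function Omega of (17).
    r1 = math.hypot(x + mu, y)
    r2 = math.hypot(x + mu - 1.0, y)
    return 0.5 * (x * x + y * y) + (1.0 - mu) / r1 + mu / r2

def omega_x_on_axis(x, mu):
    # Partial derivative of Omega with respect to x, evaluated at y = 0.
    d1, d2 = x + mu, x + mu - 1.0
    return x - (1.0 - mu) * d1 / abs(d1) ** 3 - mu * d2 / abs(d2) ** 3

def bisect(f, lo, hi, iters=200):
    # Plain bisection; f is assumed to change sign on [lo, hi].
    flo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        fmid = f(mid)
        if (fmid > 0) == (flo > 0):
            lo, flo = mid, fmid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def libration_points(mu, eps=1e-9):
    # Returns [L1, L2, L3, L4, L5] in the numbering of the text:
    # L1 between the masses, L2 to the right, L3 to the left.
    f = lambda x: omega_x_on_axis(x, mu)
    a1 = bisect(f, -mu + eps, 1.0 - mu - eps)
    a2 = bisect(f, 1.0 - mu + eps, 2.0)
    a3 = bisect(f, -2.0, -mu - eps)
    h = math.sqrt(3.0) / 2.0
    # Assumption: L4 is taken as the upper triangular point.
    return [(a1, 0.0), (a2, 0.0), (a3, 0.0), (0.5 - mu, h), (0.5 - mu, -h)]

def omega_values(mu):
    return [omega(x, y, mu) for x, y in libration_points(mu)]

print(omega_values(0.3))
```

For $\mu = 0.3$ the printed list should be decreasing from $\Omega_1$ to $\Omega_4 = \Omega_5$, so that the values $\bar H_i = -\Omega_i$ increase in the same order, which is the ordering used in the proof.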

A study of the nature of the flow in the phase space in the neighborhoods of $E_i$ leads to a sharper classification of stable motions. Prior to such study a lemma dealing with the characteristic exponents of the $E_i$ is needed.


(16) See A. Wintner, The Analytical Foundations of Celestial Mechanics, Princeton, 1941, pp. 364-366.


LEMMA. There is one positive and one negative characteristic exponent for $E_1$; the two remaining are conjugate complex numbers with negative real parts. There is one positive characteristic exponent for $E_2$; the remaining three have negative real parts and there exists a constant $\mu^*$ such that of these three: one is real and the other two are conjugate complex numbers for $0 < \mu < \mu^*$; all are real with two equal to $-2^{-3/2}$ and differing from the third if $\mu = \mu^*$; all are real and different for $\mu^* < \mu < 1$. At $E_3$ the situation is analogous to that at $E_2$ and at $E_4$, $E_5$ there are two distinct positive and two distinct negative characteristic exponents.


Consider the three libration points $L_i\,(a_i, 0)$ on the $x$-axis. One finds that

$$\Omega_{xx}(a_i, 0) = 1 + 2A_i, \qquad \Omega_{xy}(a_i, 0) = 0, \qquad \Omega_{yy}(a_i, 0) = 1 - A_i, \tag{27}$$

where

$$A_i = \frac{1 - \mu}{|a_i + \mu|^3} + \frac{\mu}{|a_i + \mu - 1|^3}.$$

If we use (27) to compute the characteristic constants of the matrix $\|-c_{q_rq_s}\|$ in Theorem 5, and take $k = 1/2^{1/2}$ in this theorem, the characteristic exponents $\lambda$ of $E_i$ turn out to be

$$\lambda = 2^{-3/2}(-1 \pm (9 + 16A_i)^{1/2}), \qquad 2^{-3/2}(-1 \pm (9 - 8A_i)^{1/2}),$$

from which the statements of the lemma relative to the characteristic exponents at $E_1$, $E_2$, $E_3$ follow, inasmuch as it may be proved that $A_1 > 4$ and $A_2$ ($A_3$) decreases (increases) monotonely and continuously from 4 (1) to 1 (4) as $\mu$ varies from 0 to 1.

After computing the partial derivatives of second order for $\Omega$ at $L_4$, $L_5$ and obtaining the characteristic constants of $\|-c_{q_rq_s}\|$, it is found that the characteristic exponents $\lambda$ of $E_4$, $E_5$ are given by

$$\lambda = 2^{-3/2}(-1 \pm (13 \pm 12(1 - 3\mu(1 - \mu))^{1/2})^{1/2}),$$

from which follow the properties for the characteristic exponents of $E_4$, $E_5$.
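The statements of the lemma lend themselves to a numerical illustration. The sketch below solves $\lambda^2 + \bar k\lambda + \delta = 0$ with $\bar k = 1/2^{1/2}$ for the characteristic constants $\delta$ of $\|-\Omega_{q_rq_s}\|$: at a collinear point these are $-(1 + 2A_i)$ and $-(1 - A_i)$, and at $L_4$, $L_5$ they are $-(3/2)(1 \pm (1 - 3\mu(1 - \mu))^{1/2})$, which is equivalent to the closed forms above. The helper names, the bisection brackets and the bracket $[0.01, 0.99]$ used to locate $\mu^*$ are assumptions of the sketch.

```python
import cmath
import math

K_BAR = 1.0 / math.sqrt(2.0)

def bisect(f, lo, hi, iters=100):
    # Plain bisection; f is assumed to change sign on [lo, hi].
    flo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        fmid = f(mid)
        if (fmid > 0) == (flo > 0):
            lo, flo = mid, fmid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def collinear_abscissas(mu, eps=1e-9):
    # a_1 (between the masses), a_2 (to the right), a_3 (to the left).
    def f(x):
        d1, d2 = x + mu, x + mu - 1.0
        return x - (1.0 - mu) * d1 / abs(d1) ** 3 - mu * d2 / abs(d2) ** 3
    return [bisect(f, -mu + eps, 1.0 - mu - eps),
            bisect(f, 1.0 - mu + eps, 2.0),
            bisect(f, -2.0, -mu - eps)]

def A_value(a, mu):
    # The quantity A_i entering (27).
    return (1.0 - mu) / abs(a + mu) ** 3 + mu / abs(a + mu - 1.0) ** 3

def roots(delta):
    # Roots of lam^2 + K_BAR lam + delta = 0 (Theorem 5).
    disc = cmath.sqrt(K_BAR ** 2 - 4.0 * delta)
    return [(-K_BAR + disc) / 2.0, (-K_BAR - disc) / 2.0]

def collinear_exponents(A):
    return roots(-(1.0 + 2.0 * A)) + roots(-(1.0 - A))

def triangular_exponents(mu):
    s = math.sqrt(1.0 - 3.0 * mu * (1.0 - mu))
    return roots(-1.5 * (1.0 + s)) + roots(-1.5 * (1.0 - s))

def mu_star():
    # Value of mu at which 9 - 8 A_2 vanishes, i.e. A_2 = 9/8.
    g = lambda mu: A_value(collinear_abscissas(mu)[1], mu) - 9.0 / 8.0
    return bisect(g, 0.01, 0.99, iters=60)

A1 = A_value(collinear_abscissas(0.3)[0], 0.3)
print(A1, collinear_exponents(A1))
print(triangular_exponents(0.3))
```

For $\mu = 0.3$ the first line should show $A_1 > 4$ together with one positive exponent, one negative exponent and a conjugate pair with real part $-2^{-3/2}$; the second line should show two positive and two negative real exponents. No numerical value of $\mu^*$ is asserted here; `mu_star()` merely locates the root of $A_2 = 9/8$ inside the assumed bracket.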

In view of this lemma, Theorem 6 ($n = 2$, $m = 3$) applies to $E_1$ for all values of $\mu$, to $E_2$ if $\mu \ne \mu^*$, to $E_3$ provided $\mu \ne 1 - \mu^*$, and ($m = 2$) to $E_4$, $E_5$ for all values of $\mu$. $E_1$, $E_2$, $E_3$ are of positive general type, since none of the linear commensurability relations II of (13) is fulfilled.


THEOREM 11. In a suitably restricted neighborhood of an $E_i$ ($i = 1, 2, 3$) of negative general type the locus of points which lie on positively [negatively] stable motions having $E_i$ as a unique $\omega$- [$\alpha$-] limiting point is an analytic hypersurface [curve]. In a suitably restricted neighborhood of an $E_i$ ($i = 4, 5$) of negative [positive] general type the locus of points which lie on positively [negatively] stable motions having $E_i$ for a unique $\omega$- [$\alpha$-] limiting point is an analytic two-dimensional manifold.


The number of classes of stable motions may now be reduced from nine 
to six. 




THEOREM 12. Excluding the values $\mu^*$, $1 - \mu^*$ of $\mu$, if stable motions other than equilibrium motions exist, the $\alpha$-limiting point is $E_4$ or $E_5$ and the $\omega$-limiting point is one of $E_1$, $E_2$, $E_3$.


To prove the theorem we show that there are no stable motions in the classes (ii), (iii) of Theorem 10.

The motions in the phase space for which the infinitesimal mass remains on the line joining the finite masses lie in the $(x, p)$-plane and are solutions of the differential system

$$\dot x = p, \qquad \dot p = \Omega_x - p/2^{1/2}. \tag{28}$$

The flow in the $(x, p)$-plane has three equilibrium motions $(a_i, 0)$ ($i = 1, 2, 3$) separated by the points $(-\mu, 0)$, $(1 - \mu, 0)$ corresponding to the finite masses. One characteristic exponent is positive and the other negative at each equilibrium motion. It follows from Theorem 6 that the locus of points in a sufficiently small neighborhood of $(a_i, 0)$ lying upon motions of (28) having $(a_i, 0)$ as a unique $\alpha$-limiting point is an analytic curve(17) through $(a_i, 0)$. These equilibrium motions appear in the phase space of (26) as $E_i$ ($i = 1, 2, 3$), and since the motions of (26) having $E_i$ as a unique $\alpha$-limiting point are confined to an analytic curve(17) through $E_i$, they must lie entirely in the $(x, p)$-plane.

A stable motion of (26) with one of $E_1$, $E_2$, $E_3$ as $\alpha$-limiting point and another as $\omega$-limiting point is therefore impossible, for such a stable motion would require the infinitesimal mass to collide with one of the finite masses.

4. The existence of stable motions. For Theorem 12 to have content a demonstration of the existence of stable motions other than equilibrium motions is essential. We shall show that two stable motions exist when $\mu = 1/2$. Whether other stable motions exist for $\mu = 1/2$, or whether any stable motions exist when $\mu \ne 1/2$, are unsolved problems.

Setting $\mu = 1/2$, $x = p = 0$ in (26) we obtain a differential system of the second order

$$\dot y = q, \qquad \dot q = \phi(y) - q/2^{1/2}, \qquad \phi(y) = y - y((1/4) + y^2)^{-3/2},$$

for the motions in the phase space when the infinitesimal mass is restricted to lie on the perpendicular bisector of the line segment connecting the two finite masses.

(17) The curve has no multiple point at the equilibrium motion.

Since $\phi$ is an odd function of $y$, the motions in the $(y, q)$-plane are paired, the members of a pair being symmetric to each other with respect to the origin. The graph $G$ of $q = 2^{1/2}\phi$ is indicated by $ABCOC'B'A'$ in Figure 1. It rises monotonely along $ABC$, $C'B'A'$ and falls monotonely along $COC'$. The flow proceeds to the right [left] in the upper [lower] half-plane, being vertical on the $y$-axis, and is directed downwards [upwards] in the region above [underneath] $G$, being horizontal on $G$, except for the equilibrium motions $B$, $O$, $B'$ corresponding to $E_5$, $E_1$, $E_4$.

The characteristic exponents of $O$ are conjugate complex numbers with negative real parts. It follows that, sufficiently near to $O$, the motions have a spiral character(18) about $O$, that is, as $t \to +\infty$ the point $P$ on the motions tends towards $O$ with the angle between $OP$ and the $y$-axis increasing indefinitely.


FIG. 1.


(18) See, for example, L. Bieberbach, Differentialgleichungen, Berlin, 1930, pp. 104-105.

At $B'$ one characteristic exponent is positive and the other is negative. Through $B'$ there accordingly pass two analytic arcs $K'B'L'$, $M'B'N'$, the loci of points lying on motions having $B'$ for unique $\alpha$-, $\omega$-limiting point, respectively. Upon calculation it is found that these arcs are disposed with respect to $G$ as shown in Figure 1.

Consider the arc $B'L'$ which in the immediate neighborhood of $B'$ lies in the open region $B'OC'B'$. In this region the flow is directed to the left and downwards, and since the region contains no equilibrium motion upon which the arc could end, it leaves the region by way of a point $R_1'$ on the arc $OC'$ to enter the open region in the lower half-plane under $G$. Here the flow is upwards and to the left and, since the region contains no equilibrium points, the arc $B'L'R_1'$ must leave it by: (a) the open arc $R_1'O$, (b) $O$, (c) the open segment $OB$ of the $y$-axis, (d) $B$, (e) the open arc $BA$. Clearly (a) is impossible, since on $R_1'O$ the flow is horizontal and directed from right to left. We may exclude (b) in view of the spiral character of the flow about $O$. The possibility (c) is illustrated in Figure 1 by $B'L'R_1'R_2'$. Paired with such a motion, there is a motion on the arc $BLR_1R_2$ symmetric to it with respect to $O$. The two arcs $BLR_1R_2$, $B'L'R_1'R_2'$ and the segments $BR_2'$, $B'R_2$ of the $y$-axis enclose a region containing $O$ into which the motion on $B'L'R_1'R_2'$ enters and never leaves. It cannot leave by way of the open segments $BR_2'$, $B'R_2$, for on the former the flow is vertically upwards and on the latter vertically downwards. Departure by way of $B$ or $B'$ is ruled out by Theorem 10 and it cannot meet either $BLR_1R_2$ or $B'L'R_1'R_2'$ for such a point of meeting can occur only at an equilibrium motion. The motion accordingly remains in this region and is therefore stable. Paired with it, there exists a second stable motion. (d) is impossible by Theorem 10.


FIG. 2.


The theorem is accordingly proved, provided we can rule out the possibility (e) pictured in Figure 2 by $B'L'R_1'R_2'$. Paired with a motion on this arc, there is a motion symmetric to it with respect to $O$ on the arc $BLR_1R_2$. These two arcs taken with $B'R_2$, $BR_2'$ bound a region containing $O$ and the motion $M'B'$ in its entirety. The motion $M'B'$ would possess $O$ or $B$ as $\alpha$-limiting point and $B'$ as $\omega$-limiting point, thus contradicting Theorem 10. Hence (e) is impossible.
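The stable motion whose existence has just been established can be followed numerically. The sketch below is an illustration only, not a substitute for the argument: it integrates $\dot y = q$, $\dot q = \phi(y) - q/2^{1/2}$ starting at a small hypothetical offset `eps` from the saddle at $y = 3^{1/2}/2$ (taken here as $B'$), displaced along the eigenvector $(1, \lambda)$ of the linearized system belonging to the positive exponent $\lambda$, where $\lambda^2 + \lambda/2^{1/2} - 9/4 = 0$ because $\phi'(3^{1/2}/2) = 9/4$. The step size and the time span are likewise assumptions.

```python
import math

SQRT2 = math.sqrt(2.0)
Y_SADDLE = math.sqrt(3.0) / 2.0  # ordinate of the saddle taken as B'

def phi(y):
    # phi(y) = y - y (1/4 + y^2)^(-3/2), from (26) with mu = 1/2, x = p = 0.
    return y - y / (y * y + 0.25) ** 1.5

def rhs(t, s):
    y, q = s
    return [q, phi(y) - q / SQRT2]

def rk4_step(f, t, s, h):
    # One classical fourth-order Runge-Kutta step.
    k1 = f(t, s)
    k2 = f(t + h / 2, [u + h / 2 * v for u, v in zip(s, k1)])
    k3 = f(t + h / 2, [u + h / 2 * v for u, v in zip(s, k2)])
    k4 = f(t + h, [u + h * v for u, v in zip(s, k3)])
    return [u + h / 6 * (a + 2 * b + 2 * c + d)
            for u, a, b, c, d in zip(s, k1, k2, k3, k4)]

def unstable_exponent():
    # Positive root of lam^2 + lam / sqrt(2) - 9/4 = 0.
    return (-1.0 / SQRT2 + math.sqrt(0.5 + 9.0)) / 2.0

def follow_arc(eps=1e-3, t_end=60.0, steps=12000):
    # Start on the linear approximation of the arc B'L' and integrate forward.
    lam = unstable_exponent()
    s = [Y_SADDLE - eps, -eps * lam]
    h, t = t_end / steps, 0.0
    max_abs_y = abs(s[0])
    for _ in range(steps):
        s = rk4_step(rhs, t, s, h)
        t += h
        max_abs_y = max(max_abs_y, abs(s[0]))
    return s[0], s[1], max_abs_y

y_end, q_end, y_max = follow_arc()
print(y_end, q_end, y_max)
```

Under these assumptions the computed motion should spiral into the origin $O$ without $|y|$ ever reaching $3^{1/2}/2$, in agreement with case (c) of the discussion; the symmetric motion issuing from $B$ is obtained by changing the signs of $y$ and $q$.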


VI. THE RESTRICTED PROBLEMS OF ELLIPTIC 
AND HYPERBOLIC TYPE 


1. The differential equations of motion. If the mass $\mu$ moves in an elliptic [hyperbolic] orbit about the mass $1 - \mu$, it is known(19) that

$$\rho = a(1 - e\cos\varphi), \qquad a^{3/2}(\varphi - e\sin\varphi) = \kappa\tau,$$

$$[\rho = a(e\cosh\psi - 1), \qquad a^{3/2}(e\sinh\psi - \psi) = \kappa\tau].$$

The independent variable $t$ in IV turns out to be proportional to $\varphi$ [$\psi$] and we have

$$\rho = a(1 - e\cos(t/a^{1/2})), \qquad [\rho = a(e\cosh(t/a^{1/2}) - 1)],$$

so that

$$l = \log a(1 - e\cos(t/a^{1/2})), \qquad [l = \log a(e\cosh(t/a^{1/2}) - 1)],$$

$$k = \frac{e\sin(t/a^{1/2})}{a^{1/2}(1 - e\cos(t/a^{1/2}))}, \qquad \left[k = \frac{e\sinh(t/a^{1/2})}{a^{1/2}(e\cosh(t/a^{1/2}) - 1)}\right],$$

are to be taken in the canonical equations (25) with $H$ defined as in (18) and the dot denoting differentiation with respect to $t$.

For rectilinear problems the independent variable $t$ in IV is employed to yield

$$\begin{aligned} \rho &= 2a\,\mathrm{sech}^2(t/2^{1/2}), \qquad & [\rho &= 2a\,\mathrm{csch}^2(t/2^{1/2})], \\ k &= -(1/2^{1/2})\tanh(t/2^{1/2}), \qquad & [k &= -(1/2^{1/2})\coth(t/2^{1/2})]. \end{aligned} \tag{29}$$

The differential equations (1) take the simple form

$$\ddot x + k\dot x = \Omega_x, \qquad \ddot y + k\dot y = \Omega_y.$$

It is readily verified that the differential equations for the two types interchange when $t$ is replaced by $t + (\pi/2^{1/2})i$. One obtains the canonical form (25) for these equations by taking the above values of $k$ and placing $b = 0$ in the definition of $H$ in (16).


2. Limiting motions. In the general problem of hyperbolic type $e > 1$ and the $\omega$-limiting motions are the motions

$$\begin{aligned} x &= x_0 + a^{1/2}p_0(1 - e^{-t/a^{1/2}}), \qquad & y &= y_0 + a^{1/2}q_0(1 - e^{-t/a^{1/2}}), \\ p &= p_0e^{-t/a^{1/2}}, \qquad & q &= q_0e^{-t/a^{1/2}}. \end{aligned} \tag{30}$$

The $\omega$-limiting motions of a positively stable motion are equilibrium motions in (30). However since these equilibrium motions do not form a discrete set, it cannot be assumed that the motion tends uniformly towards a definite equilibrium motion, for it is conceivable that the motion tends toward a set of equilibrium motions.

For rectilinear problems of elliptic or hyperbolic type, it follows from (29) that the $\alpha$- and $\omega$-limiting motions are, respectively, motions of the dissipative and acquisitive rectilinear problems of parabolic type. A positively [negatively] stable motion approaches one of the equilibrium motions $E_i$ of the rectilinear problem of parabolic type uniformly as $t \to +\infty$ [$-\infty$]. It will be observed that as $t \to \pm\infty$ the finite masses tend towards collision.


(19) See F. R. Moulton, op. cit., pp. 158-159 and pp. 177-178.


UNIVERSITY OF MARYLAND,
COLLEGE PARK, MD.




